PREDICTION OF SENIOR HIGH-SCHOOL SUCCESS AT
VARIOUS LEVELS OF INTELLIGENCE 


ROYAL B. EMBREE, JR. 
University of Minnesota 


This investigation was undertaken to discover whether the effi- 
ciency of certain measures used in the prediction of high-school success 
differs as these devices are used with varying levels of intelligence. 
The three bases of prediction considered were honor-point ratio for the 
ninth grade, IQ, and age at entering the ninth grade. The measure
of senior high-school success was the honor-point ratio for all marks 
received during the tenth, eleventh, and twelfth grades. An IQ range 
of sixty points was arbitrarily divided into three parts to furnish the 
levels of intelligence used. 

A supplementary purpose of the study was to consider the relative 
worth of the measures used, and to arrive at a reasonably accurate 
basis for the prediction of senior high-school success. 


DESCRIPTION OF THE INVESTIGATION 


Material for the investigation was accumulated from the records 
of University High School, Minneapolis, Minnesota. This institu- 
tion, the practice school of the College of Education, University of 
Minnesota, is a junior-senior high school enrolling approximately 
four hundred pupils in grades seven through twelve. Two hundred 
seventy-one subjects were included in the study, each of whom had 
complete records for the ninth, tenth, eleventh, and twelfth grades, 
and had graduated from the institution. The investigation covered 
the period from 1928 to 1935. The group was almost equally divided 
between those who entered University High School at the ninth grade 
and those who entered earlier, in either the seventh or eighth grade. 
The two hundred seventy-one cases included approximately equal
numbers of boys and girls, which is the typical distribution of the 
school. 

It has already been mentioned that the two hundred seventy-one 
subjects were divided into three groups representing different levels of 
intelligence. The lines of demarcation were set at IQ 110 and IQ 130, 
and a range of twenty points was allowed for each group. Group I 
included sixty-seven cases lying between IQ 90 and IQ 109; Group II 
contained one hundred forty-two cases lying between IQ 110 and 
IQ 129; and Group III contained sixty-two cases lying between IQ 130
and IQ 149. Group I-II-III, which embraces all subjects, included
two hundred seventy-one cases lying between IQ 90 and IQ 149. 
Two cases with IQ’s above 150 were excluded from the investigation 
because they lay outside the prescribed limits and it was feared that 
they would influence Group III unduly. The mean and standard 
deviation of IQ for Group I-II-III are almost identical with those
determined for the entire University High School student body over a 
period of several years. 

The measure of senior high-school achievement used was the senior 
high-school honor-point ratio, which is variable y in all tables. This 
was calculated upon the basis of all courses taken in the tenth, eleventh, 
and twelfth grades, with the exception of music and physical education. 
Final marks for each of nine quarters were utilized, letter grades being 
assigned the following values: A = 3, B = 2, C = 1, D = 0, F = -1.
The honor-point ratio was secured by dividing the total number of 
quarter credits into the total number of honor points earned. 
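The computation just described can be sketched in Python. This is a minimal illustration, not the procedure actually used by the school: the function name, the input layout, and the weighting of each mark by its quarter credits are assumptions; only the letter-grade values come from the text.

```python
# Point values stated in the text: A = 3, B = 2, C = 1, D = 0, F = -1.
HONOR_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "F": -1}


def honor_point_ratio(marks):
    """Return total honor points divided by total quarter credits.

    marks: list of (letter_grade, quarter_credits) pairs. This layout is
    hypothetical, and weighting each mark by its credits is an assumption.
    """
    total_points = sum(HONOR_POINTS[grade] * credits for grade, credits in marks)
    total_credits = sum(credits for _, credits in marks)
    if total_credits == 0:
        raise ValueError("no credits recorded")
    return total_points / total_credits


# One A, two B's and one C, each carrying one quarter credit.
example = [("A", 1), ("B", 1), ("B", 1), ("C", 1)]
print(honor_point_ratio(example))  # 2.0
```

Under this scale a pupil with straight C marks has a ratio of 1.00, and a ratio can fall below zero only when failures outweigh marks above D.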

Three independent variables were used in the prediction of senior 
high-school achievement. These were ninth-grade honor-point ratio, 
termed variable x1, a measure of intelligence, termed variable x2, and
age at entering the ninth grade, termed variable x3.

Ninth-grade honor-point ratio was derived in the manner described 
above. The same values were assigned to the letter grades, and marks 
for three quarters were included. The measure of intelligence used 
in the investigation was the median IQ of five standard group intelli- 
gence tests with results equated upon the basis described by W. S. 
Miller. Age at entering ninth grade was the chronological age in 
months of each subject at mid-September of the year in which he 
began ninth-grade work. The mean and standard deviation of each 
variable for the three individual groups and the combined group are 
shown in Table I. 

Table I. Means and Standard Deviations, for Each Group, of Variables y
(Senior High-school Honor-point Ratio), x1 (Ninth-grade Honor-point
Ratio), x2 (IQ), and x3 (Ninth-grade Age)

           Group I           Group II          Group III         Group I-II-III
           (N = 67)          (N = 142)         (N = 62)          (N = 271)
              M       SD        M       SD        M       SD        M       SD
y           .96      .54     1.45      .59     1.99      .59     1.45      .68
x1          .81      .55     1.35      .60     1.87      .56     1.34      .68
x2       103.78     4.59   119.44     5.48   137.05     4.36   119.57    12.38
x3       176.85    10.25   169.31     6.19   164.71     8.86   170.32     9.35

The investigation is divided into three parts. The first involves 
comparison of the groups representing various levels of intelligence 
upon the bases of several zero order correlations, multiple correlations 
with two and with three independent variables, and correlations 
between honor-point ratios for senior high school and for ninth grade 
with the influence of IQ controlled by the partial correlation technique. 

The second part of the study deals with the predictive measures 
as applied to the combined group, Group I-II-III. Regression equations
are furnished both with and without the ninth-grade age variable,
and some sample predictions are included. 

The third part of the study is concerned with two problems which 
came to light during the investigation. These are the difference 
between honor-point ratios for the ninth grade and for senior high 
school, and the differences existing between students who entered the 
school at the ninth grade and those who began at either the seventh 
or eighth grade. 


COMPARISON OF THE EFFICIENCY OF PREDICTION FOR THE 
THREE GROUPS 


The predictive value of ninth-grade honor-point ratio, IQ, and 
age at entering the ninth grade was determined by multiple correlation 
for each of the three groups, as was the multiple correlation for two 
variables, ninth-grade age being excluded in this case. Further 
comparison was made by computing for each group the correlation 
between ninth-grade and senior high-school honor-point ratios with 
the effect of IQ controlled by partial correlation. The three groups 
were also compared upon the basis of each of the six inter-relationships 
determined in the study, as expressed by the correlations of senior
high-school honor-point ratio with ninth-grade honor-point ratio, IQ,
and ninth-grade age; of ninth-grade honor-point ratio with IQ and 
ninth-grade age; and of IQ with ninth-grade age. Each individual 
group was also compared with the combined group upon all the bases 
mentioned above. 

The zero order correlation coefficients for each of the four groups, 
and the multiple and partial correlation coefficients are shown in 
Table II. The means and standard deviations of the variables have 
already been given. 
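The text does not say how the figure that accompanies each coefficient in Table II was obtained. The printed values appear consistent with the classical large-sample standard error of a correlation coefficient, (1 - r^2) / sqrt(N). The Python sketch below rests on that assumption; the function name is illustrative.

```python
import math


def se_of_r(r, n):
    """Classical large-sample standard error of a correlation coefficient.

    Assumption: this is the formula behind the figures printed beside the
    coefficients in Table II; the paper itself does not state it.
    """
    return (1 - r * r) / math.sqrt(n)


# Correlation of senior high-school with ninth-grade honor-point ratio.
for label, r, n in [("Group I", 0.794, 67),
                    ("Group II", 0.801, 142),
                    ("Group I-II-III", 0.853, 271)]:
    print(label, round(se_of_r(r, n), 3))
```

For these three entries the formula returns .045, .030 and .017, the values printed in the table. Other entries agree to within a unit in the third decimal place.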


Table II. Correlation Coefficients Expressing the Various
Inter-relationships among Variables y (Senior High-school
Honor-point Ratio), x1 (Ninth-grade Honor-point Ratio), x2
(IQ), and x3 (Ninth-grade Age)

              Group I          Group II         Group III        Group I-II-III
              (N = 67)         (N = 142)        (N = 62)         (N = 271)
r_yx1          .794 ± .045      .801 ± .030      .745 ± .056      .853 ± .017
r_yx2          .249 ± .115      .408 ± .070      .180 ± .123      .596 ± .039
r_yx3          .077 ± .121     -.106 ± .084      .067 ± .126     -.244 ± .057
r_x1x2         .285 ± .112      .408 ± .070      .195 ± .122      .426 ± .050
r_x1x3         .094 ± .121      .052 ± .084      .150 ± .124     -.199 ± .058
r_x2x3        -.426 ± .100     -.300 ± .076     -.078 ± .126     -.527 ± .044
R_y.x1x2x3     .795 ± .045      .806 ± .029      .747 ± .056      .893 ± .012
R_y.x1x2       .794 ± .045      .806 ± .029      .746 ± .056      .891 ± .012
r_yx1.x2       .779 ± .048      .761 ± .035      .736 ± .058      .823 ± .020

The correlations between age and achievement in both the ninth 
grade and senior high school range about zero for the individual groups 
and are significantly negative for Group I-II-III. Definitely negative 
correlations were found between age and IQ for Groups I, II, and 
I-II-III. In the case of Group III, which included pupils between 
IQ 130 and IQ 149, this relationship, although negative, is not signifi- 
cantly different from zero. 

The relationships between IQ and honor-point ratios for ninth 
grade and senior high school are marginally significant for Group I, 
definitely significant for Group II and Group I-II-III, and not significantly
different from zero for Group III. The correlations between
ninth-grade achievement and senior high-school achievement, as 
measured by honor-point ratios, are high for all groups. 
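The two-predictor multiple correlation and the partial correlation reported in Table II can be recovered from the zero-order coefficients alone. The sketch below uses the standard textbook formulas; the paper does not print them, so their use here is an assumption, and the function names are illustrative.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of y with two predictors from zero-order r's."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


def partial_r(r_y1, r_y2, r_12):
    """Correlation of y with predictor 1, predictor 2 held constant."""
    return (r_y1 - r_y2 * r_12) / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))


# Zero-order coefficients for Group I-II-III as given in Table II:
# r_yx1 = .853, r_yx2 = .596, r_x1x2 = .426.
print(round(multiple_r_two_predictors(0.853, 0.596, 0.426), 3))
print(round(partial_r(0.853, 0.596, 0.426), 3))
```

The multiple correlation comes out at about .891, as printed. The partial correlation comes out within .002 of the printed .823; the small gap is presumably due to rounding of the inputs.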

It seems probable that the relatively low correlations between
IQ and achievement in the cases of Groups I, II, and III are due, in 
part at least, to the restriction of range imposed by grouping subjects 
within the narrow IQ classifications. The standard deviations 
of IQ distributions for Groups I and III are smaller than that for 
Group II. The Z test of Fisher? for determining the significance of 
differences between standard deviations was applied to each of these 
differences. The ratio of differences to their standard errors is 1.68 
for Groups I and II, 2.12 for Groups II and III. While these figures 
do not indicate marked significance, there would appear to be a tendency
toward greater homogeneity of IQ in the IQ 90 to IQ 109 and
IQ 130 to IQ 149 ranges than in the IQ 110 to IQ 129 range.
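The two ratios quoted in this paragraph can be approximated from the standard deviations in Table I. The sketch assumes the classical form of Fisher's z for two variance estimates: z is half the natural logarithm of the variance ratio, with standard error sqrt((1/2)(1/(n1 - 1) + 1/(n2 - 1))). The paper does not print the formula, so this is a reconstruction rather than the author's stated method.

```python
import math


def sd_difference_ratio(sd_a, n_a, sd_b, n_b):
    """Ratio of Fisher's z for two standard deviations to its standard error.

    z = 0.5 * ln(sd_a**2 / sd_b**2), which equals ln(sd_a / sd_b).
    The standard-error formula is the classical one and is assumed here.
    """
    z = math.log(sd_a / sd_b)
    se = math.sqrt(0.5 * (1 / (n_a - 1) + 1 / (n_b - 1)))
    return z / se


# IQ standard deviations: Group I 4.59 (N = 67), Group II 5.48 (N = 142),
# Group III 4.36 (N = 62).
print(round(sd_difference_ratio(5.48, 142, 4.59, 67), 2))  # Groups II and I
print(round(sd_difference_ratio(5.48, 142, 4.36, 62), 2))  # Groups II and III
```

The first ratio reproduces the reported 1.68. The second comes out near 2.11 against the reported 2.12, a difference that rounding of the published standard deviations could explain.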

In order to determine whether there are real differences in the 
efficiency of prediction at the various levels of intelligence, all coeffi- 
cients of correlation were transformed into Z equivalents following 
the method described by Fisher.? For each relationship discovered 
in the study, differences were determined among Groups I, II, and III, 
and between Group I-II-III and Group I, Group II, and Group III, 
respectively. The standard error, the ratio of the difference to its 
standard error, and the Pearson p value representing the probability 
of occurrence by chance were computed for each difference. Results 
of comparing the restricted groups appear in Table III, while com- 
parisons are made between the combined group and each individual 
group in Table IV. 
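The comparison procedure described above can be sketched as follows. The r-to-z transformation is Fisher's; the standard error sqrt(1/(n1 - 3) + 1/(n2 - 3)) and the two-sided normal probability are assumptions, since the paper names the method without printing formulas. Function names are illustrative.

```python
import math


def fisher_z(r):
    """Fisher's transformation of a correlation coefficient."""
    return math.atanh(r)


def compare_correlations(r_a, n_a, r_b, n_b):
    """Return (z_a, z_b, difference, standard error, ratio, two-sided P)."""
    z_a, z_b = fisher_z(r_a), fisher_z(r_b)
    diff = abs(z_a - z_b)
    se = math.sqrt(1 / (n_a - 3) + 1 / (n_b - 3))  # assumed n - 3 form
    ratio = diff / se
    p = math.erfc(ratio / math.sqrt(2))  # two-sided normal tail area
    return z_a, z_b, diff, se, ratio, p


# Correlation of IQ with ninth-grade age, Group I (N = 67) against
# Group III (N = 62), using the coefficients given in Table II.
z1, z3, diff, se, ratio, p = compare_correlations(-0.426, 67, -0.078, 62)
print(round(z1, 2), round(z3, 2), round(diff, 2), round(p, 2))
```

For this pair the sketch gives Z values of about -.45 and -.08 and a P near .04, in line with the one marginally significant difference the tables report among the restricted groups.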

There are no definitely significant differences among Groups I, 
II, and III for any relationship. Differences are smallest between 
Groups I and II, and do not consistently favor either. The differences 
are somewhat larger between Groups II and III, though none is even 
marginally significant. Differences are slight between Groups I and 
III except for one, between the correlation coefficients for ninth-grade 
age and IQ, which is marginally significant. When multiple correla- 
tion coefficients are considered, both with and without the third inde- 
pendent variable, ninth-grade age, Group II is slightly superior to 
both Group I and Group III, and Group I superior to Group III. 
However, but one difference found approaches even the lowest level 
of significance. Partial correlations, with IQ held constant, disclose 
even slighter differences between the groups. 

Comparisons between Group I-II-III, with its tendency toward
broader dispersion in all variables, and the individual groups produce
the expected results. In all relationships except those involving the
indecisive age variable, the combined group is superior to all individual
groups. Six of fifteen such differences fall within the .01 limit of 
significance, ten within the .05 limit of significance. When intelli- 
gence, as measured by IQ, is held constant by partial correlation, 
differences between Group I-II-III and its constituent parts decrease 
below the marginal level of significance, but they continue to favor 
the larger group. 


Table III. Comparison of Differences among the Correlation
Coefficients, Transformed into Z Equivalents, of Groups I, II,
and III

            Groups II and I           Groups II and III         Groups I and III
              Z_II   Z_I Diff.     P    Z_II Z_III Diff.     P     Z_I Z_III Diff.     P
r_yx1         1.10  1.08   .02   .90    1.10   .96   .14   .37    1.08   .96   .12   .51
r_yx2          .43   .25   .18   .23     .43   .18   .25   .11     .25   .18   .07   .70
r_yx3         -.02   .08   .10   .51    -.02   .07   .09   .56     .08   .07   .01   .95
r_x1x2         .43   .29   .14   .35     .43   .20   .23   .14     .29   .20   .09   .62
r_x1x3         .05   .09   .04   .79     .05   .15   .10   .52     .09   .15   .06   .74
r_x2x3        -.31  -.45   .14   .35    -.31  -.08   .23   .14    -.45  -.08   .37   .04
R_y.x1x2x3    1.12  1.08   .04   .79    1.12   .97   .15   .33    1.08   .97   .11   .54
R_y.x1x2      1.12  1.08   .04   .79    1.12   .96   .16   .30    1.08   .96   .12   .61
r_yx1.x2      1.00  1.04   .04   .79    1.00   .94   .06   .70    1.04   .94   .10   .58

Table IV. Comparison of Differences between the Correlation
Coefficients, Transformed into Z Equivalents, of Group I-II-III
and Group I, Group II, and Group III, Respectively

(Z_all denotes the Z equivalent for Group I-II-III.)

            Group I-II-III and I      Group I-II-III and II     Group I-II-III and III
             Z_all   Z_I Diff.     P   Z_all  Z_II Diff.     P   Z_all Z_III Diff.     P
r_yx1         1.27  1.08   .19   .17    1.27  1.10   .17 [illegible] 1.27 .96  .31   .08
r_yx2          .69   .25   .44   .01     .69   .43   .26   .01     .69   .18   .51   .01
r_yx3         -.25   .08   .33   .02    -.25  -.02   .23   .03    -.25   .07   .32   .08
r_x1x2         .45   .29   .16   .25     .45   .43   .02   .85     .45   .20   .26   .08
r_x1x3        -.20   .09   .29   .04    -.20   .05   .25   .02    -.20   .15   .35   .02
r_x2x3        -.59  -.45   .14   .31    -.59  -.31   .28   .01    -.59  -.08   .51   .01
R_y.x1x2x3    1.44  1.08   .36   .01    1.44  1.12   .32   .01    1.44   .97   .47   .01
R_y.x1x2      1.43  1.08   .35   .01    1.43  1.12   .31   .01    1.43   .96   .47   .01
r_yx1.x2      1.17  1.04   .13   .35    1.17  1.00   .17   .10    1.17   .94   .23   .11

It may be concluded that there are no significant differences 
between the three levels of intelligence with regard to the efficiency 
of the predicting variables or the isolated relationships. There is a 
tendency for correlations between variables to be less decisive in the 
case of Group III, which included only exceptional pupils between
IQ 130 and IQ 149. The correlations between IQ and honor-point
ratios for both the ninth grade and senior high school do not differ 
significantly from zero in this case, nor does the correlation between IQ 
and ninth-grade age. Neither IQ nor age exerts real differentiating
influence upon the achievement of this group within its boundaries. 


THE GENERAL PREDICTIVE MEASURE 


This phase of the investigation was directed toward the creation 
of a basis for predicting senior high-school achievement which would be 
sufficiently accurate for use in the guidance of pupils. Two bases for
the prediction of senior high-school honor-point ratio were considered. 
The first made use of three independent variables, ninth-grade honor- 
point ratio, IQ, and ninth-grade age. The second was based upon
ninth-grade honor-point ratio and IQ only. Group I-II-III was used 
in both cases. 

The standard regression coefficients and the multiple regression
equation in raw score form for variables x1 (ninth-grade honor-point
ratio), x2 (IQ), and x3 (ninth-grade age) on dependent variable y (senior
high-school honor-point ratio) are as follows:

\[ \beta_{x_1} = .72988, \qquad \beta_{x_2} = .32196, \qquad \beta_{x_3} = .07076 \]
\[ y = -2.51347 + .72881\,x_1 + .01768\,x_2 + .00514\,x_3 \]
\[ \text{Standard Error of Estimate: } \sigma_{y.x_1x_2x_3} = .306 \]

Standard regression coefficients and the multiple regression equation
for variables x1 and x2 on variable y are as follows:

\[ \beta_{x_1} = .73207, \qquad \beta_{x_2} = .28371 \]
\[ y = -1.39387 + .73100\,x_1 + .01558\,x_2 \]
\[ \text{Standard Error of Estimate: } \sigma_{y.x_1x_2} = .309 \]


The small independent contribution made to prediction by the 
ninth-grade age variable is evidenced by its regression coefficient and 
by the fact that its exclusion from multiple correlations, as shown in 
Table II, makes no appreciable difference in the value of these relation- 
ships. Further evidence that age is of no practical value to such a 
formula in this setting is given in Table V, which contains sample 
predictions for five pupils selected from University High School. 
Regression equations with and without the age variable are used, and 
their results are directly comparable. 
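The two regression equations reported in this section can be applied directly. The following Python sketch uses the published coefficients; the function names and argument order are illustrative and not part of the original study.

```python
def predict_three(x1, x2, x3):
    """Predicted senior high-school honor-point ratio, three predictors.

    x1: ninth-grade honor-point ratio, x2: IQ,
    x3: age in months at entering the ninth grade.
    """
    return -2.51347 + 0.72881 * x1 + 0.01768 * x2 + 0.00514 * x3


def predict_two(x1, x2):
    """Predicted senior high-school honor-point ratio, age omitted."""
    return -1.39387 + 0.73100 * x1 + 0.01558 * x2


# Pupil A of Table V: x1 = 2.50, x2 = 143, x3 = 165.
print(round(predict_three(2.50, 143, 165), 2))
print(round(predict_two(2.50, 143), 2))
```

For pupil A the two equations differ by only about .02 of an honor point, far less than the standard error of estimate of either equation. This illustrates how little the age variable contributes.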

It is evident that two variables, ninth-grade honor-point ratio 
and IQ, supply a rather accurate measure for predicting senior high-school
achievement. The multiple correlation coefficient representing
the combined effect of these two variables is .891. The standard error
of estimate of y from x1 and x2 is .309. Like most measures of this
type, the equation is far from absolute. However, its precision ranks 
well in comparison with similar instruments for prediction, and it may 
be of definite value in guidance and personnel work. 
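As a check on the two-variable equation, the raw-score weight for IQ and the standard error of estimate can be recomputed from figures printed elsewhere in the paper. The relations used are the usual textbook ones and are assumed here, since the paper does not state them; the standard deviations are those of Group I-II-III in Table I.

```latex
% Assumed textbook relations; SD_y = .68 and SD_{x_2} = 12.38 from Table I.
\[ b_{x_2} = \beta_{x_2}\,\frac{SD_y}{SD_{x_2}}
           = .28371 \times \frac{.68}{12.38} \approx .01558 \]
\[ \sigma_{y.x_1x_2} = SD_y \sqrt{1 - R_{y.x_1x_2}^{2}}
                     = .68\,\sqrt{1 - (.891)^{2}} \approx .309 \]
```

Both results agree with the published values. On the usual normal-theory reading, roughly two predictions in three should fall within about .31 of an honor point of the ratio actually earned.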


Table V. Predicted Senior High-school Honor-point Ratios Computed
with Two and Three Variables

Pupil      x1      x2      x3    Predicted y     Predicted y    Actual y
                                 (x1, x2, x3)    (x1, x2)
A        2.50     143     165        2.68            2.66         2.50
B        2.50     127     169        2.42            2.42         2.44
C         .67     100     169         .61             .65         1.16
D         .83     123     169        1.14            1.13          .26
E         .42      99     184         .49             .46          .39

OTHER PROBLEMS ARISING FROM THE INVESTIGATION 


The mean honor-point ratios for each individual group and for the 
combined group were higher for the senior high-school years than for 
the ninth grade. These differences were tested by the formula for the 
standard error of the differences between the means of correlated 
measures. The results are given in Table VI. 
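The formula referred to is not printed in the paper. A standard form of the standard error of the difference between means of correlated measures is sqrt(se_a^2 + se_b^2 - 2 r se_a se_b), where each se is a standard deviation divided by sqrt(N). The Python sketch below assumes that form and takes the standard deviations from Table I and the correlations from Table II.

```python
import math


def se_diff_correlated_means(sd_a, sd_b, r, n):
    """Standard error of the difference between two correlated means.

    Assumed form: sqrt(se_a^2 + se_b^2 - 2 * r * se_a * se_b).
    """
    se_a = sd_a / math.sqrt(n)
    se_b = sd_b / math.sqrt(n)
    return math.sqrt(se_a ** 2 + se_b ** 2 - 2 * r * se_a * se_b)


# (label, SD of y, SD of x1, r_yx1, N)
groups = [("Group I", 0.54, 0.55, 0.794, 67),
          ("Group II", 0.59, 0.60, 0.801, 142),
          ("Group III", 0.59, 0.56, 0.745, 62),
          ("Group I-II-III", 0.68, 0.68, 0.853, 271)]
for label, sd_y, sd_x1, r, n in groups:
    print(label, round(se_diff_correlated_means(sd_y, sd_x1, r, n), 3))
```

Under these assumptions the four standard errors come out close to .043, .032, .052 and .022. Because the two honor-point ratios are highly correlated, these standard errors are much smaller than they would be for independent samples of the same size.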

The mean honor-point ratio is uniformly higher in the senior
high school. For Groups I, II, and I-II-III, the differences are highly
significant. For Group III, composed of exceptional children, the
difference is only marginally significant. There is evident in University
High School a definite tendency for marks to increase from the
ninth grade to senior high school. This cannot be due to elimination,
for all subjects included in the study completed their senior years and
received diplomas of graduation.

Table VI. Comparison, by the Difference between Means, of Senior
High-school and Ninth-grade Honor-point Ratios, with Chances
in One Thousand That True Difference Lies in the Direction
Found

                   Senior    Ninth    Differ-    SD dif-    Difference /     Chances in
                     y        x1       ence      ference    SD difference    one thousand
Group I              .96      .81      .15        .043          3.47             999
Group II            1.45     1.35      .10        .032          3.14             999
Group III           1.99     1.87      .12        .052          2.31             989
Group I-II-III      1.45     1.34      .11        .022          5.00             999

In order to determine whether or not this difference in achievement 
results from poor adjustment on the part of pupils entering the school 
for the first time at the ninth-grade level, the comparisons shown in 
Table VII were made. Eighty-seven pupils who had entered Uni- 
versity High School in seventh or eighth grade were selected in order 
from subjects of the investigation and compared with eighty-four 
similarly selected individuals who had begun work in the ninth grade. 
Comparisons were based upon IQ and honor-point ratio for the ninth 
grade. 

The old pupils proved superior to the new in ninth-grade achievement,
but the difference is not even marginally significant. On the
other hand, the pupils who began work at University High School in
the seventh or eighth grade were significantly superior in intelligence,
as measured by IQ, to those who entered at the ninth grade.
The ratio of the difference to its standard error is less than 3.0, but
chances that the true difference lies in the direction of the discovered 
difference are better than ninety-nine in one hundred. 
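The comparison of old and new pupils can be sketched as a difference between the means of two independent groups. Two things are assumed rather than taken from the paper: that the standard error is sqrt(sd_a^2/n_a + sd_b^2/n_b), and that the "chances in one thousand" are the one-sided normal probability of the observed ratio. Means and standard deviations are those of Table VII.

```python
import math


def compare_independent_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (difference, standard error, ratio, chances in 1000)."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    ratio = diff / se
    # One-sided normal probability that the true difference has the same
    # sign as the observed difference (an assumed interpretation).
    chances = 1000 * 0.5 * (1 + math.erf(ratio / math.sqrt(2)))
    return diff, se, ratio, chances


# IQ: old pupils 121.20 (SD 11.34, N = 87), new pupils 116.64 (SD 11.49, N = 84).
print(compare_independent_means(121.20, 11.34, 87, 116.64, 11.49, 84))
# Ninth-grade honor-point ratio: old 1.44 (SD .63), new 1.31 (SD .68).
print(compare_independent_means(1.44, 0.63, 87, 1.31, 0.68, 84))
```

For IQ the sketch gives a standard error near 1.75 and a ratio near 2.61, as reported. For the honor-point ratio it gives a ratio near 1.3 and roughly 900 chances in one thousand.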

Table VII. Comparison, by the Difference between Means, of the
Honor-point Ratios (x1) and IQ's (x2) of Ninth-grade Pupils New
to the School and Those Having Been Enrolled One or Two
Years, with Chances in One Thousand That True Difference
Lies in the Direction Found

               Old       New      Differ-    SD dif-    Difference /     Chances in
              pupils    pupils     ence      ference    SD difference    one thousand
Mean x1        1.44      1.31       .13       .101          1.29             901
SD x1           .63       .68
Mean x2      121.20    116.64      4.56      1.75           2.61             996
SD x2         11.34     11.49

There is, then, no real difference between old and new pupils in 
ninth-grade achievement. However, there exists a definite tendency 
for those students who enter this experimental high school during its 
first two years to be superior in IQ to those who enter at the ninth 
grade. An adequate explanation can not be given for this tendency 
without further study. One possible explanation is that professional 
people able to pay the fees of such an institution and anxious to have
this training begin at an early date may tend to have brighter children 
than parents who decide upon this training at a later grade level. 
This assumption could be tested only by comparing the home back- 
ground and parental status of the two groups of pupils. A second, 
and probably more likely explanation, is that brighter children may 
tend to attract attention earlier, both in elementary school and home, 
thus causing their parents to feel at an earlier date that they can 
profit by specialized schooling. 


SUMMARY AND CONCLUSIONS 


Two hundred seventy-one pupils who had attended University 
High School from the ninth grade or before through the twelfth grade 
were divided into three groups upon the basis of intelligence as meas- 
ured by five group tests. Group I, containing sixty-seven cases, 
represented a range from IQ 90 to IQ 109; Group II, containing 
one hundred forty-two cases, a range from IQ 110 to IQ 129; Group III, 
containing sixty-two cases, a range from IQ 130 to IQ 149. The 
groups were compared to determine whether there existed differences 
between one and another with regard to the efficiency with which senior 
high-school achievement could be predicted by ninth-grade achieve- 
ment, IQ, and age at entering the ninth grade. 

Upon the basis of the total number of cases, termed Group I-II-III, 
regression equations and multiple correlation coefficients including the 
age variable were compared with the same functions with this variable 
excluded. 

Lastly, ninth-grade and senior high-school honor-point ratios were 
compared for all groups, and pupils entering University High School 
at the ninth grade were compared with those who entered at the 
seventh and eighth grades upon the bases of IQ and ninth-grade 
achievement. 

The results of this investigation may be summarized in the follow- 
ing conclusions. 

1. No significant differences were found among three levels of 
intelligence with regard to the efficiency with which senior high-school 
achievement may be predicted by ninth-grade achievement, IQ, and 
ninth-grade age. 

2. There is indication of a tendency for the inter-relationships 
among the four variables to be less decisive in the case of pupils above 
IQ 130 than for either of the lower groups. 

3. The use of age at entering ninth grade as a variable for predicting
senior high-school success does not appreciably affect results as shown 
by multiple correlation coefficients and regression equations. This 
measure does not make sufficient independent contribution to justify 
its use. 

4. The investigation gives evidence that in University High School, 
pupils’ marks are higher in the tenth, eleventh, and twelfth grades 
than in the ninth grade. 

5. The difference between the ninth-grade achievement of pupils 
entering the school for the first time at that level and those who had 
begun in either seventh or eighth grade is not significant. 

6. Pupils who entered at the seventh or eighth grade were signifi- 
cantly superior in IQ to those who began work at the ninth grade. 

Generalization upon the basis of this investigation is limited by 
two factors. The first and most important is that University High 
School represents a student body of superior intelligence. It was 
impossible to measure the efficiency of prediction at levels below IQ 90. 
However, this should not detract from the results determined upon the 
groups used, for they may be considered good samples of their various 
levels of intelligence. While there was no opportunity to study pupils 
of low average mental ability, there did exist an excellent opportunity 
to draw a representative sample of exceptional children. The second 
limiting factor is the number of cases included in Groups I and III. 
This matter must be decided arbitrarily, and, while there were but 
sixty-seven and sixty-two cases, respectively, in these groups, they are 
carefully drawn samples, and no statistical abnormality is reflected 
by any data secured during the investigation. 


THE EFFECT OF REPEATED PRAISE OR BLAME ON
THE PERFORMANCE OF INTROVERTS AND 
EXTROVERTS* 


GEORGE FORLANO 


Teachers College, Columbia University 
AND 
HYMAN C. AXELROD 


Psychologist, Hebrew National Orphan Home 


In previous investigations of motivation attempts had been made
to measure the effects of various incentives on learning and performance.
Rivalry as a motive was studied by Whitemore, Hurlock,
and Maller. Their results indicate that both children and adults
respond to competition as an incentive with a marked increase in
performance. Praise and reproof as incentives were investigated by
Gilchrist, Gates and Rissland, Hurlock, and Brenner. Their
results, however, are not consistent. Gilchrist found praise more
effective than reproof with college students. Gates and Rissland
report slight differences in the average improvement of three groups
of college students who were subjected to praise, reproof, and indifference.
Hurlock finds no significant difference between reproof and
praise in connection with the performance of elementary-school
children, but in a later study, where four forms of motivation were
applied, the order of effectiveness of the incentives was found to be
praise, reproof, indifference, and control. Brenner finds not only that
blame and indifference are more effective than praise but also that
blame works more for the integration of learning and recall.


Conflicting results of the various studies may be due to different
conditions under which the experiments were conducted as well as to
the kind and intensity of the incentives used. But, although investigators
have tried to control the factors which enter into the experimental
situation in the form of external stimuli, they have overlooked
the importance of the effect of personality differences on responses
made to these stimuli. The effectiveness with which one responds to
any incentive is not only determined by one's mental capacity but is
also influenced by his temperamental habits and attitudes.





*The authors wish to thank Professor Rudolf Pintner for many helpful 
suggestions. 



In any learning situation such affective disturbances as excitement, fear,
or timidity can and do influence the efficiency and speed of the learner.
In the ordinary classroom situation such personality handicaps usually
become apparent. Pupils of high intelligence who are disposed to
restlessness and excitability often lag behind in educational achievement.
Such disparity between mental ability and achievement has
been shown by Rogers to be significantly related to perseveration.
Bird found that children with such personality handicaps as fear and
introversion required three times as many trials as a "normal" group
in learning to trace letters and words. The findings of McGeoch and
Whitely suggest that learning and recall may be related to introversion
and submissiveness as measured by a standardized scale. Triplett
reports that young, nervous, and excitable subjects prone to overstimulation
through rivalry are less efficient in competitive performance.
These experimental findings indicate the important rôle that
temperament plays in any learning situation.

Therefore, the effectiveness of any incentive depends not only on 
the set-up of the experiment and the intellectual level of the group but 
is also conditioned by the emotional states and personality differences 
of the subjects. The implicit assumption that seems to be made in
most studies, namely, that all individuals of the same age and intelli- 
gence respond alike to an incentive is untenable. What is intended as 
praise may be accepted as such by some individuals and actively 
resented by others. The shy person may be stimulated by social 
approval to a more marked degree than the aggressive type. An 
introvert may be entirely indifferent to blame whereas the extrovert 
may be aroused to greater activity by words of disapproval. Differ- 
ences in personality may produce a wide variation in the motivation 
of different members of the same group resulting in wide divergencies 
in performance. 

The experiment to be reported focuses attention on the problem of 
motivation and its relation to personality.


THE PROBLEM 


The purpose of this investigation is to determine experimentally 
the effects of repeated praise or blame on the performances of children 
who, on the basis of their responses to a psychological questionnaire, 
were classified as extroverts or introverts. 


GROUPS STUDIED


Pupils of four classes in the fifth grade of a public school in the 
City of New York were selected as subjects for the experiment. Pupils 
from average and bright classes were included. 


TESTS USED 


The psychological questionnaire on the basis of which the experi- 
mental groups were determined was an extroversion inventory devised 
by Pintner and others.* This instrument, as yet unpublished, consists 
of thirty-five self-description items which have been culled from other 
standardized extroversion tests and adapted to children of elementary- 
school level. The reliability of the scale as determined by the re-test 
method is .70. Three weeks elapsed between the initial and the re-test. 
Validity is based on the method of item selection and internal con- 
sistency. Items 5, 10, 15 and 20 of the inventory are given below: 








Item
  5   [illegible in source]                               Same | Different
 10   I find it easy to start speaking to a new pupil     Same | Different
 15   I can be scolded without feeling hurt               Same | Different
 20   [illegible in source]                               Same | Different

In order to measure the effects of motivation on a simple function 
the Woodworth-Wells Number Cancellation Test was used. This 
test correlates low with intelligence and has a reliability of .80. Three 
similar forms were prepared, each involving the crossing out of the 
number 7. 


FORMATION OF THE EXPERIMENTAL GROUPS 


Several days prior to the experiment the extroversion inventory 
was given to each of the four classes. On the basis of their total scores 
the children in each grade were divided into two groups. Those who 
scored above the median were designated as the E (extrovert) group, 
and those who scored below the median as the I (introvert) group.
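The median split described above can be sketched as follows. The data layout and function name are hypothetical. The text does not say how pupils scoring exactly at the median were treated, so the sketch returns them separately rather than guessing.

```python
from statistics import median


def split_by_median(scores):
    """Divide one class into E and I groups by total inventory score.

    scores: dict mapping a pupil identifier to a total score
    (hypothetical layout). Pupils exactly at the median are returned
    separately because the text does not say how such ties were handled.
    """
    cut = median(scores.values())
    extroverts = [p for p, s in scores.items() if s > cut]
    introverts = [p for p, s in scores.items() if s < cut]
    at_median = [p for p, s in scores.items() if s == cut]
    return cut, extroverts, introverts, at_median


# Invented scores for a class of five pupils.
class_scores = {"a": 10, "b": 20, "c": 30, "d": 25, "e": 15}
print(split_by_median(class_scores))
```

Applying the split within each class, as the text describes, means that the cut-off score may differ from class to class.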





* The extroversion inventory is one of the three sections of The Personality 
Test. The other two sections of the Personality Test consist of items purporting 
to measure ascendance-submission and emotionality. The Personality Test is to be 
published by the World Book Co., Yonkers-on-the-Hudson, New York. 
This classification can be defended only on the assumption that the
inventory gives some indication of a child’s habits and attitudes that
have been found by previous investigators to be symptomatic of
extroversion or introversion. Since the incentives to be applied
were praise and blame, the classification resulted in the formation
of four experimental groups designated as EP (extrovert-praise),
EB (extrovert-blame), IP (introvert-praise), IB (introvert-blame),
and one control group C.


EXPERIMENTAL PROCEDURE 


Without making any introductory remarks to the children Form I 
of the Cancellation Test was administered to the experimental groups. 
Pupils were instructed to hold the test blanks face down until the 
directions were fully explained. After the signal to begin canceling
the sevens was given, thirty seconds were allowed for practice. The
pupils were then told to draw a circle around the last number crossed 
out and to continue from that point for another two minutes. 

In order to apply individual incentives of praise or blame, the 
examiner called each pupil to the desk to receive a mark from his 
regular teacher, who, regardless of the pupil’s actual performance, 
graded the paper either P (poor) or G (good) according to pre-arranged 
instructions. In the first class the papers of the Introverts were 
marked P and those of the Extroverts were marked G. In the second 
class the procedure was reversed: Extroverts received P and Introverts 
received G. In the third class the pupils were divided into four groups, 
namely, EP, EB, IP, and IB. The third class was divided in this
manner in order to increase the number of cases in each of the
experimental groups. When each paper had been marked by the teacher,
the pupils were instructed to hold the tests face down so as not to show
the grades to others. The examiner then announced that all those
who had received P had done poor work and those who had received G had
done good work. Thus, each child received individual praise or blame,
namely, approval or disapproval from the teacher, and indirect group
approval or disapproval.

In order to measure the effects of these incentives another form of 
the Cancellation Test was immediately administered. At the end 
of two minutes of work the incentives were applied again by following
exactly the procedure described above. By giving each child the same
mark on Form II as on Form I, praise and blame were thereby repeated
and, we believe, intensified.


A third form of the Cancellation Test was then given in order to
measure the effects of the repeated incentives. At the end of the third 
testing period, the papers were immediately collected without being 
marked. 

The procedure used with experimental groups was followed with 
the control group except that no incentives were applied. 


TREATMENT OF THE DATA: EQUATION OF GROUPS


The three forms of the Cancellation Test were scored by the 
experimenters. The score was the number of sevens crossed out
within the two-minute period. Three scores were available for each
pupil. The four experimental groups and the control group were
equated on the basis of their initial cancellation scores. Chronological
age was not considered because it correlated only -.03 with the
cancellation score.


TABLE I. MEANS OF EQUATED GROUPS ON THE INITIAL CANCELLATION TEST

Group     N      Mean     Standard deviation
IB        25     57.98     8.91
EB        27     56.49    10.29
IP        27     56.15    10.17
EP        26     57.05     8.10
C         26     56.46     8.64














Table I shows the mean scores obtained on the initial cancellation
test. It will be seen that the largest mean difference is that between
the IB and IP groups, namely, 1.83. The standard error of this mean
difference is 2.65 and the standard ratio* is .69. On the basis of these
results the groups may be considered roughly equal statistically.
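
As a check on the figures just quoted, the standard ratio can be recomputed from the IB and IP rows of Table I. The sketch below assumes that the standard error of a difference between two independent means is the square root of the sum of the squared standard errors of the two means; the article does not print its formula, and the function name is illustrative only.

```python
import math

def standard_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (difference, SE of difference, standard ratio) for two independent groups."""
    diff = mean_a - mean_b
    # Assumed formula: SE_diff = sqrt(SD_a^2 / N_a + SD_b^2 / N_b)
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, se_diff, diff / se_diff

# IB and IP rows of Table I: mean, standard deviation, N
diff, se_diff, ratio = standard_ratio(57.98, 8.91, 25, 56.15, 10.17, 27)
print(round(diff, 2), round(se_diff, 2), round(ratio, 2))
```

Under this assumption the sketch reproduces the difference of 1.83, the standard error of 2.65, and the standard ratio of .69 given in the text.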


COMPARISON OF GROUPS 





The raw scores for each subject on the successive trials of the test 
were transmuted into gains or losses with reference to the score on the 
initial test. The constant number of one hundred was added to each 
difference to avoid negative signs. Therefore, one hundred indicates 
no change at all, a number more than one hundred denotes a gain, and 
a number less than one hundred, a loss. 
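
The transmutation can be illustrated with a short sketch. The scores below are hypothetical and the function name is an assumption made for illustration; only the rule itself (later score minus initial score, plus one hundred) comes from the text.

```python
def transmute(initial, later, constant=100):
    """Gain or loss relative to the initial score, shifted by a constant."""
    return later - initial + constant

# Hypothetical cancellation scores for one pupil on Trials 1, 2, and 3
trial_1, trial_2, trial_3 = 56, 63, 52
print(transmute(trial_1, trial_2))  # 107: a gain of 7
print(transmute(trial_1, trial_3))  # 96: a loss of 4
```

A pupil whose score did not change would receive exactly 100.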





* In this article the ratio of the mean difference to its standard error is termed 
the standard ratio in order to distinguish it from the critical ratio of McGaughy 
and the experimental coefficient of McCall. 
Table II presents the transmuted gains or losses for each of the
groups. It will be noted that the experimental and the control groups 
show gains on the second and third trials of the Cancellation Test. 


TABLE II. MEANS OF TRANSMUTED DIFFERENCES BETWEEN CANCELLATION
SCORES ON INITIAL TRIAL AND EACH OF THE SUCCEEDING TRIALS

              Trial 1-Trial 2               Trial 1-Trial 3
Group     Mean      SD     Range        Mean      SD      Range
IB       114.38    7.56    98-128      117.74    9.90    95-143
EB       111.29    8.43    98-128      119.84   10.32    98-137
IP       107.05    4.59    97-115      112.60    7.59    94-124
EP       108.88    6.33    97-124      112.34    7.17    97-130
C        108.04    4.20    98-116      109.34    6.57    98-119























That these gains are not due to chance is indicated in Table III. 


The difference between the means of initial and each succeeding 
trial is in each case statistically reliable. The standard ratios range 
from 6.97 to 9.67 for Trial 1-Trial 2 and from 7.26 to 10.02 for Trial 
1-Trial 3. If this experiment were to be repeated we may be practi- 
cally certain that both the experimental and control groups would 
show gains in performance on the second and third trials of the test. 
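
The standard ratios cited here can be approximately recovered from the Trial 1-Trial 2 columns of Table II. The sketch below rests on two assumptions not stated in the article: that the standard error of each mean gain is the standard deviation divided by the square root of N, and that N for each group is the value given in Table I. The names are illustrative.

```python
import math

# N from Table I; mean and SD of the Trial 1-Trial 2 transmuted scores from Table II
groups = {
    "IB": (25, 114.38, 7.56),
    "EB": (27, 111.29, 8.43),
    "IP": (27, 107.05, 4.59),
    "EP": (26, 108.88, 6.33),
    "C": (26, 108.04, 4.20),
}

def gain_ratio(n, mean, sd, baseline=100.0):
    """Mean gain over the no-change baseline, its assumed SE, and their ratio."""
    gain = mean - baseline
    se = sd / math.sqrt(n)  # assumed formula for the SE of a mean
    return gain, se, gain / se

results = {name: gain_ratio(*row) for name, row in groups.items()}
for name, (gain, se, ratio) in results.items():
    print(f"{name}: gain={gain:.2f} SE={se:.2f} ratio={ratio:.2f}")
```

The ratios obtained this way agree with those printed in Table III to within about a tenth; the small discrepancies presumably reflect rounding or a slightly different standard-error formula in the original computation.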


TABLE III. RELIABILITIES OF DIFFERENCES BETWEEN MEANS OF INITIAL TRIAL
AND EACH OF THE SUCCEEDING TRIALS OF THE CANCELLATION TEST

               Trial 1-Trial 2                     Trial 1-Trial 3
Group    Mean          SE    Standard       Mean          SE    Standard
         difference          ratio          difference          ratio
IB         14.38      1.51     9.52           17.74      1.98     8.96
IP          7.05       .88     8.01           12.60      1.44     8.75
EB         11.29      1.62     6.97           19.84      1.98    10.02
EP          8.88      1.24     7.16           12.34      1.41     8.75
C           8.04       .83     9.67            9.34      1.29     7.26

















A further examination of Table III reveals consistently greater
mean differences and standard errors for the B (blame) groups. This
may indicate that blame not only produced a greater increment in
performance but also increased the variability of the groups. From
these results, however, we cannot draw any conclusions as to the
relative effectiveness of praise and blame as incentives. In order to
determine the potency of these motives we must compare the
performances of the experimental and control groups on both successive trials.
Such comparisons are indicated in Table IV.


TABLE IV. RELIABILITIES OF DIFFERENCES BETWEEN MEANS OF EXPERIMENTAL
AND CONTROL GROUPS ON TRIALS 2 AND 3 OF THE CANCELLATION TEST

                    Trial 2                            Trial 3
Groups   Mean          SE    Standard       Mean          SE    Standard
         difference          ratio          difference          ratio
IB-C        6.34      1.72     3.69            8.40      2.36     3.56
IP-C        -.99      1.22     -.83            3.26      1.93     1.69
EB-C        3.25      1.83     1.78           10.50      2.36     4.45
EP-C         .84      1.50      .56            3.00      1.91     1.57
IB-IP       7.33      1.75     4.19            5.14      2.45     2.10
EB-EP       2.41      2.03     1.19            7.50      2.43     3.09
IB-EB       3.09      2.21     1.39           -2.10      2.80     -.75
IP-EP      -1.83      1.52    -1.20             .26      2.01      .13























It will be noted that when the IB and EB groups are compared
with the C or IP and EP groups on Trial 2, certain significant
differences are revealed. The standard ratios are in each case in favor of
the IB and EB groups but only two differences are statistically reliable,
those of IB-C and IB-IP. On Trial 3 we find a similar result: the
standard ratios are again in favor of the IB and EB groups, indicating
reliable differences for IB-C, EB-C and EB-EP.
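
The reliable comparisons named above can be recovered from the group means and standard errors in Table III. The sketch below assumes that the groups are independent, so that the standard error of a difference is the square root of the sum of the two squared standard errors, and it assumes a cutoff of 3 for calling a standard ratio reliable. The article states neither the formula nor the cutoff, so both are labeled as assumptions in the code.

```python
import math

# Mean gain and SE of the mean for each group, from Table III: (Trial 2, Trial 3)
table_iii = {
    "IB": ((14.38, 1.51), (17.74, 1.98)),
    "IP": ((7.05, 0.88), (12.60, 1.44)),
    "EB": ((11.29, 1.62), (19.84, 1.98)),
    "EP": ((8.88, 1.24), (12.34, 1.41)),
    "C": ((8.04, 0.83), (9.34, 1.29)),
}
comparisons = [("IB", "C"), ("IP", "C"), ("EB", "C"), ("EP", "C"),
               ("IB", "IP"), ("EB", "EP"), ("IB", "EB"), ("IP", "EP")]
THRESHOLD = 3.0  # assumed cutoff for a "reliable" difference; not stated in the article

def compare(a, b, index):
    """Standard ratio of the difference in mean gain (index 0 = Trial 2, 1 = Trial 3)."""
    (mean_a, se_a), (mean_b, se_b) = table_iii[a][index], table_iii[b][index]
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # assumes independent groups
    return (mean_a - mean_b) / se_diff

reliable = {
    index + 2: [f"{a}-{b}" for a, b in comparisons
                if abs(compare(a, b, index)) >= THRESHOLD]
    for index in (0, 1)
}
print(reliable)
```

With these assumptions the sketch singles out IB-C and IB-IP on Trial 2, and IB-C, EB-C, and EB-EP on Trial 3, which are the comparisons the text describes as reliable.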

When the performances of the IP, EP, and C groups are compared
no reliable differences are found on the second trial. On the third
trial, however, the differences are in favor of the IP and EP groups,
but are not reliable.

On the basis of these comparisons we may conclude that blame
was more effective than either praise or the absence of incentive in
producing an increment in performance on both successive trials of the
test. Praise, however, begins to show its influence on the third trial.
What the full effect of praise would be if the number of trials were
increased is an interesting problem for further investigation.

We may now direct our attention to the main problem of this
investigation, which is concerned with the relation between individual
personality differences and motivated performance. The specific
question to be answered is: “Do Extroverts and Introverts differ in
their responses to praise or blame?” Again we refer to Table IV,
in which the experimental groups are compared with respect to both
incentives. The standard ratios for IB-IP and EB-EP indicate that
blame produced a greater increment in the performance of the
Introverts on Trial 2 but that when the incentive is repeated the Extroverts
who were blamed are stimulated to greater activity, as shown on Trial 3.
When direct comparisons are made between IB-EB and IP-EP the
differences are small and statistically unreliable but nevertheless reveal
the same tendency. The Introverts respond more readily to blame after the
first application, but when the incentive is intensified the Extroverts
produce the larger gain in performance. It is interesting to note that
praise has the reverse effect. Extroverts are more responsive to one
application of praise than the Introverts. With the second
application of praise the Introverts slightly surpass the performance of the
Extroverts.


SUMMARY AND CONCLUSIONS 


1. Children classified as Introverts and Extroverts on the basis of
their responses to the Pintner IE Inventory show differences in their
reactions to social incentives. Blame as a form of motivation is in
general more effective than praise or indifference. Introverts who
were blamed made a statistically significant increase over the
performance of the control group after both the first and second application of
blame, whereas the Extroverts had to be blamed twice before their
increase over the performance of the control group was statistically
significant.

The comparisons of the Introverts-blamed vs. the Introverts- 
praised and Extroverts-blamed vs. the Extroverts-praised again 
indicated that blame produced a greater increase in the performance 
of the Introverts who were blamed once, and that only after the second 
application of blame did the Extroverts-blamed show a statistically 
significant increase over the Extroverts-praised. 

2. In general, praise apparently did not exercise any differentiating 
effect.* 

* It is well to remember that here we studied the effect of praise and blame
upon the child’s factual performance; what attitudinal concomitants were
produced in the child as a result of repeated blame and praise is a subject
for further investigation.

3. The results of this study indicate that the mere repetition of the
Cancellation Test under the conditions of the experiment produces a
marked improvement in performance on the second and third trials
of the test.

4. It is probable that the time limit for the work period was too 
short and that the number of trials was too few to bring out the full 
effects of the incentives. 
EMOTIONAL FACTORS IN VERBAL LEARNING:
IV. EVIDENCE FROM REACTION TIME 


HAROLD D. CARTER* 
Institute of Child Welfare, University of California 


INTRODUCTION 


Previous studies2,3,4 have established a relationship between the
affective characteristics of words and the ease with which they are
learned. Words classified on the basis of ratings as P (pleasant),
I (indifferent), or U (unpleasant) were used in the experiments. The
subjective classification was supported by evidence from a
galvanometric study,2 showing that galvanic deflections tend to be high for
unpleasant words, slightly less high for pleasant words, and relatively
low for indifferent words. This order of magnitude (U, P, I) differs
in one respect from the order for ease of learning (P, U, I). Both in
the galvanic deflections and in the learning scores the most marked
differences are those between the I words and the (subjectively)
emotionally-toned words. This supports the hypothesis that intensity
of stimuli is a factor favoring learning. But the disagreement between
the rank orders of P and U words for galvanic deflection and for
learning suggests a second hypothesis.

This second hypothesis is that quality of emotional tone also exerts 
an influence upon the learning; for investigating this hypothesis, the 
association-time data reported in the present paper are valuable. 
For study of differences between pleasant and unpleasant words, the 
galvanometer is inefficient as compared with the association-time 
method. The new evidence secured through application of this 
technique appears to give further insight into the previously estab- 
lished trends in the learning. 


RELATION TO THE LITERATURE 


The value of some of the techniques employed in this research has 
been indicated by earlier workers, but full use has not been made of 
the methods. Smith’s" use of both galvanometer and free association 
data was very stimulating, but his procedures for collecting learning 
data were inadequate. Jones’! first used the galvanometer in the 





* The writer is indebted to H. S. Conrad for criticism of the manuscript and 
valuable suggestions, and to H. E. Jones for administrative assistance, criticism, 
and advice throughout the course of the investigation. 






102 The Journal of Educational Psychology 


study of reliable learning data; he recommended that further studies 
be made with more extensive samplings of words. The free association 
method has been recommended':'5.18.19 as one of the best techniques 
for measurement of emotions, but it has received only very limited 
use in the work upon affective factors in learning, in the studies by 
Tolman”®:?! and Smith.4* From published reviews?:*-5-"* it is apparent 
that very few comprehensive studies have been made upon the problem 
of affective factors in learning. 

The present series of studies differs from earlier experiments not 
in the use of any one technique, but in the fact that the available 
procedures for securing introspective and experimental data have been 
modified and combined in the study of the same group of subjects. 
This group includes about one hundred children who are being exam- 
ined repeatedly by means of rating procedures, galvanometer tests, 
free association tests, and individual learning tests. For needed con- 
trol experiments, several other large groups of subjects are available. 


THE DATA 


Free association tests were given in the usual manner* to a group 
of one hundred children (fifty boys and fifty girls) who were in the 
sixth and seventh grades of the public schools of Oakland, California, 
at the time of first testing (Spring, 1933). The tests were given five 
times at half-year intervals, with new word-stimuli at each sitting. 
The procedure parallels that of the learning experiment previously 
described2,3,4; the words employed here are those included in the
learning tests and listed in an earlier report. Equal numbers of words 
were included in the three categories, Pleasant, Indifferent, and 
Unpleasant. The earlier reports describe in detail the procedures 
used to eliminate differences in familiarity or difficulty of the words 
in the different categories. The pleasant group includes such words 
as love, kiss, and happy; these are rather uniformly rated as pleasant. 
The indifferent group includes such words as glass, walk, and cloudy; 
these are rather uniformly rated as indifferent. In the unpleasant 
group are such words as insult, stink, and coward, which are quite 
generally rated as unpleasant. In the report in which the complete 
list of words is published‘ evidence is given to show that the words 
fall into three well-separated groups. 

* See reference 19 in the bibliography for complete description of the free
association experiment and discussion of the literature.

In the free-association tests, the children were instructed to reply
to each word stimulus with the first word that came to mind. Association
times were recorded to the nearest tenth-second by means of a
stop-watch; although this probably involves a slight constant error, 
as pointed out by Dunlap,’ the procedure is sufficiently accurate for 
present purposes. Symonds” states that those who have fairly 
considered the merits of different procedures have usually preferred 
the stop-watch to more complicated apparatus. The stop-watch 
used in the present experiment is started silently by advancing a slide 
toward the crown, and stopped by depressing the slide away from the 
crown. Pressure upon the stem returns the hand to zero noiselessly. 
The hand makes a complete circuit of the dial in thirty seconds. Of 
several standard timers tried out, this type was found to be the most 
accurate, and by far the most convenient. 

Both the association times and the word-responses from the 
children were recorded. The association responses furnish interesting 
evidence of the emotional stimulus value of the words used in the 
study, but satisfactory methods of analysis have not yet been devised 
to handle such data. The present brief report is limited to considera- 
tion of certain features of the time records. 


TREATMENT OF THE DATA 


The treatment of the data parallels in essential features the analysis
of the learning data previously reported.2,3,4 Since results for
individual items are somewhat unreliable, both in the learning and in the
free association time data, we have been interested in measures based 
upon suitable groups of items. In this treatment, a group of words 
homogeneous in one respect has been regarded as a test, in which the 
individual words are the items. The scores used in the study are 
obtained by simple summation of item-scores.* 

The distributions of association times to words are positively 
skewed. However, the skewness is markedly reduced when the data 
under consideration are combinations of items. The fundamental 
requirement is a discriminatory measurement; for the present purpose, 
this requirement is obviously better met when several items, homo- 
geneous in one respect, are included in the measure. 

The questions with which we are primarily concerned at present 
may be answered through study of distributions of difference scores. 





* This procedure does not imply criticism of other modes of analysis. We 
believe that the study of individual items will be productive, especially when a 
large number of items are accumulated and methods of analysis are worked out 
for study of the items with respect to the several types of data. The application 
of mental-test summation procedures to these data seems advantageous at the 
present time. 
We wish to know whether association times for the particular words
used in our learning tests show reliable differences between the pleas- 
ant, indifferent, and unpleasant categories. 

Three difference scores were computed for each child. This was
done at each stage in the experiment, to show the effect of increasing
the amount of data used in the analysis. The P - I (read P minus I)
difference score would be positive if the pleasant words required longer
association times than the indifferent words. The U - P difference
score would be positive if the child took longer to respond to the
unpleasant words than to the pleasant words. Finally, the U - I
score would be positive if the child took more time in response to
unpleasant than to indifferent words. If there were no general differences
in association times between the categories, the difference scores would
approximate zero. Negative scores would indicate differences in the
opposite directions.
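
The three difference scores, and the test of whether their means depart from zero, can be sketched as follows. The association-time totals are hypothetical, the names are illustrative, and the standard error of the mean is assumed to be the sample standard deviation divided by the square root of the number of children; the article refers the reader elsewhere for its statistical procedures.

```python
import math
import statistics

# Hypothetical summed association times (arbitrary units) for four children,
# one total per word category: P (pleasant), I (indifferent), U (unpleasant).
children = [
    {"P": 20, "I": 19, "U": 25},
    {"P": 18, "I": 20, "U": 22},
    {"P": 23, "I": 21, "U": 28},
    {"P": 19, "I": 20, "U": 21},
]

def difference_scores(child):
    """The three difference scores described in the text."""
    return {
        "P-I": child["P"] - child["I"],
        "U-P": child["U"] - child["P"],
        "U-I": child["U"] - child["I"],
    }

scores = [difference_scores(c) for c in children]

def mean_and_ratio(key):
    """Mean difference, its assumed standard error, and the ratio of the two."""
    values = [s[key] for s in scores]
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # assumed SE formula
    return mean, se, mean / se

for key in ("P-I", "U-P", "U-I"):
    mean, se, ratio = mean_and_ratio(key)
    print(f"{key}: mean={mean:.2f} SE={se:.2f} ratio={ratio:.2f}")
```

In this made-up sample the P-I scores average zero, while the U-P and U-I means are positive, which is the pattern a longer delay for unpleasant words would produce.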


RESULTS 


The distributions of the difference scores showed wide individual 
variation, scores ranging from fairly high negative to fairly high 
positive values. The distributions of difference scores seemed to be 
approximately normal. We are concerned with the deviations of the 
means from zero. 

Table I presents the mean difference-scores based upon association 
time scores for the three categories of words. The standard errors 
of the means, and the critical ratios, are also given. The critical ratios 
indicate whether the mean differences are reliably greater than their 
standard errors. For discussion of the statistical procedures involved, 
the reader is referred to Kelley. 

Table I shows that the association times are reliably longer for the 
unpleasant words than for words in the other two categories. There 
are no reliable differences in association time between the pleasant and 
indifferent words used in the study. These trends of results are shown 
when each of the three “tests” or categories includes only a few words. 
Additional confidence in the results is justified by the fact that
extension of the “tests” or categories brings out the trends with increasing
clearness.

TABLE I. MEAN DIFFERENCES BETWEEN CHILDREN’S ASSOCIATION-TIME SCORES
FOR GROUPS OF PLEASANT (P), INDIFFERENT (I), AND UNPLEASANT (U) WORDS

                                           P - I    U - P    U - I
Series I, eight words:
  Mean*                                     1.30     4.00     5.30
  Standard error of mean                    1.68     1.40     1.28
  Critical ratio                             .82     2.86     4.14
Series I and II, sixteen words:
  Mean                                      1.60     8.80    10.40
  Standard error of mean                    1.90     1.58     1.76
  Critical ratio                             .84     5.57     5.91
First three series, twenty-four words:
  Mean                                      -.80    11.30    10.50
  Standard error of mean                    2.24     2.14     2.08
  Critical ratio                            -.36     5.28     5.05
First four series, thirty-two words:
  Mean                                     -1.60    17.90    16.30
  Standard error of mean                    2.71     3.10     2.54
  Critical ratio                            -.59     5.77     6.42
All five series, forty words:
  Mean                                      4.80    15.70    20.50
  Standard error of mean                    2.89     2.73     3.23
  Critical ratio                            1.66     5.75     6.35

* The mean in each case is based upon data from one hundred children. The
number of words in each category is as indicated in the table.

The present results are to be considered in relation to the learning
results reported in an earlier study.4 There the argument rested upon
the learning data for these same words grouped in the same categories.
It was shown that the children learned the P words best, the U words
next best, and the I words most poorly. The differences in learning
were reliable. The same cumulative treatment of the data showed
that the differences increased as the body of data was extended series
by series.


DISCUSSION 


The foregoing results have a bearing upon a number of problems 
which have arisen in connection with studies of emotional factors in 
learning. 

1. The classification of words on the basis of ratings is sometimes 
challenged on the ground that factors which produce emotional 
responses to words may through emotional disturbance, suppression, 
or repression, reduce the validity of the classification. The substantial 
difference herein reported between the reaction times to unpleasant 
and to other types of words is in accord with results from previous 
clinical and experimental studies which indicate that delayed reaction 
time may be a “complex indicator,” that unpleasant words are more
likely to produce delay while substitute associations are rehearsed, etc.
We may conclude that the present study provides evidence which 
supports our subjective classification, at least with regard to the 
demarcation of unpleasant from other words. 

2. It has been suggested that differences in learning words of 
differing emotional significance may be due primarily to differences 
in the familiarity or concreteness of the words. It has been reported 
in several studies’’!*-® that association times tend to be greater for 
words which are difficult or unfamiliar. In the present material,
marked differences in learning have been found between the I words
and the P words, not however accompanied by association time
differences. We may infer that the learning differences found in this series
of studies are not dependent upon the factor of familiarity. This 
conclusion is also supported by the comparison of the unpleasant and 
indifferent words; these show differences in reaction time which are 
opposite in direction to the differences in learning. Other data leading 
to a rejection of the familiarity theory were reported in an earlier 
study.‘ 

3. Both pleasant and unpleasant words, as classified by ratings, 
arouse emotional response as measured by the galvanometer. This 
general fact has been demonstrated in a number of studies,?:!!»"® one
of which deals with a portion of our present data. But the free associa- 
tion data here reported show that the U words elicit a type of emotional 
response not characteristic of the P words. In view of the fact that 
the P words are better learned than the U words, the data suggest 
that the nature of the emotion is an important factor in determining 
its effect upon learning. 

The present data furnish additional support for our assumption 
that two types of emotional tone operate with different effect upon 
the learning. We may state the theory in terms of two hypotheses. 
The first is that words which are emotionally-toned are better learned 
than words which are not emotionally-toned. This effect, which 
may be attributed to intensity of the stimuli, has been demonstrated 
in all our learning results. The second hypothesis is that unpleasant 
emotional tone contains some elements which tend to inhibit learning. 
The inhibition of the unpleasant merely serves to depress the level of 
efficiency below that which one would expect in terms of the intensity 
value of the stimuli. Hence, when the materials used for comparison 
are equal in intensity, the truth of the theory of bidirectional effect 
of emotion upon learning appears. The inhibition of the unpleasant 
can be demonstrated in comparing it with the pleasant. 
Smith’s theory’ of bidirectional effects has sometimes been regarded
as inconsistent with the facts, since unpleasant words are not harder 
to learn than indifferent words. The present formulation differs from 
Smith’s only in taking account of the first factor, which we might call 
intensity. Smith’s statement of the theory would lead one to expect
the order of efficiency in learning to be P, I, U. The present theory
leads one to expect the P, U, I order. Doubt concerning the factual
side of the problem was possible so long as no single study provided a 
large accumulation of data; this is true because of the specificity of 
learning results for single words, and the lack of reliability and general 
significance of results based upon too few items. The present series 
of studies has demonstrated consistent and reliable trends in the 
learning, supporting the P, U, I order of learning. When the first
hypothesis is taken into account, the present data support the
conclusions set forth in the 1929 paper by Jones.11


SUMMARY AND CONCLUSIONS 


A study of affective factors in learning is being conducted with a 
group of about one hundred children who have been tested repeatedly 
over a period of years. The re-test technique has permitted collection 
of several types of data which could only be obtained in a cumulative 
testing program. Each of these lines of information is essential to 
the study of the problem. 

In a series of reports, it has been shown that words classified as 
pleasant are better learned than words classified as unpleasant, and 
that the latter are better learned than words classified as indifferent. 
These findings are statistically reliable. Analysis of a portion of the 
material shows that both pleasant and unpleasant words as classified by 
ratings produce emotional response as measured by the galvanometer. 

The free association method used in the present report permits one 
to differentiate between pleasant and unpleasant stimuli in an objective 
fashion. Association times are reliably greater for words in the 
U category than for words in the P or I categories. There are no 
reliable association-time differences between the P and I words. 

Association times are known to be longer for words which are 
difficult or unfamiliar. The data therefore are inconsistent with the 
hypothesis that inter-category differences in familiarity or concreteness 
of the words can account for the trends in learning. The accumulation 
of data is consistent with the theory that emotionally-toned material 
is easier to learn than indifferent material, and that unpleasant emo- 
tional responses contain elements which tend to inhibit learning. 
THE LEARNING RATIO AND ITS APPLICATION*


JOHN NOBLE WASHBURNE 


Syracuse University 
I. THE BASIC EQUATIONS 


In an earlier article, ‘‘The Definition of Learning,’’” it was argued 
that learning may preferably be defined as ‘‘improvement, through 
experience, of problem solving ability.” It was further argued that 
problem solving consists in the ability of the individual to employ such 
effort and help as may be at his disposal in order to reach a goal in 
spite of difficulties. This ability, or insight, may be expressed as a
ratio between, on the one hand, those internal and external factors
which constitute the difficulty of the goal (namely, g, the nature or
complexity of the goal itself; o, the external obstacles between the
active agent and the goal; and r, the internal resistances,
misconceptions, interfering habits, and the like, that must be overcome) and,
on the other hand, those internal and external factors which overcome
the difficulty of the goal (namely, e, the effort or energy output of the
agent; m, his memories, facilitating habits and other internal helps;
and h, external helps such as cues, tools, and the like). This ratio is
written:


$$e(h + m) = g(o + r) \tag{1}$$

Learning, or improvement in problem solving ability, it was argued,
always involves an increase in m or a decrease in r or both. It is, in
short, equivalent to an increase in the value of m/r. The equivalents
of m/r in our formula are, we find by simple algebraic transposition:

$$\frac{m}{r} = \frac{g}{e}\left(\frac{o}{r} + 1\right) - \frac{h}{r} \tag{2}$$
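The transposition from formula No. 1 to formula No. 2 is stated without its intermediate steps. A minimal sketch of those steps, assuming only formula No. 1 and that neither e nor r is zero, might run as follows:

```latex
\begin{align*}
e(h + m) &= g(o + r) \\
h + m &= \frac{g}{e}\,(o + r) \\
m &= \frac{g}{e}\,(o + r) - h \\
\frac{m}{r} &= \frac{g}{e}\left(\frac{o}{r} + 1\right) - \frac{h}{r}
\end{align*}
```

Read this way, and with the other factors held fixed, the right-hand side grows as g/e grows and shrinks as h grows, which is the sense in which the several descriptions of learning are gathered into one expression.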

* The writer wishes to express appreciation for the various helpful criticisms
and suggestions made by P. M. Symonds in his editorial review of the present
article, and to Professor I. S. Carroll of the Department of Mathematics, Syracuse
University, for his critical inspection of the mathematical work here presented.

Written this way the formula covers all the currently widely
accepted descriptions of learning. Plainly an increase in the value
of m/r involves any or all of the following things: an increase in m,
which corresponds to learning as elaboration (increase of associations
or “connections”); a decrease in r, which corresponds to learning
as simplification (elimination of waste motion or “telescoping” of
behavior); an increase in g, which corresponds to learning as goal
elaboration; a decrease in e, which corresponds to learning as effort
reduction; an increase in o, which corresponds to learning as increased
ability to overcome obstacles; and a decrease in h, which corresponds
to learning as “cue reduction.”*

But the formula does more than include and condense these various
descriptions of learning; it modifies them by indicating a definite
proportion between the increases and decreases which are involved, 
and it should therefore, even in the absence of exact quantitative values 
for the equated factors, prove useful in interpreting and predicting the 
outcome of learning experiments. The purpose of the present article 
is to show how the formula may be used for such interpretations and 
predictions. 

We may proceed by examining two well worked areas of experimentation:
one in which the results all seem to agree, and the other
in which they seem, in some cases, not to agree. The formula, if
useful, should help explain and predict both agreements and
disagreements. The first area is that of experiments in the conditioned reflex
and the second is that of experiments in part vs. whole methods of 
learning. 

Before analyzing a characteristic experiment in the field of the 
conditioned reflex, it is necessary to differentiate this type of learning 
from trial and error. One or the other of these two types of learning 
(or none at all) takes place whenever there is no spontaneous insight,
that is, whenever the situation is such that e(h + m)/g(o + r) does not
equal one. Under such circumstances this ratio must, of course, equal
either more than one or less than one. Giving z a positive value in the
first case and a negative value in the second, we may say that:

$$\frac{e(h + m)}{g(o + r)} = 1 + z \tag{3}$$

Whether z is positive or negative, it must be reduced to zero in order
to establish the unit of initial achievement where e(h + m)/g(o + r)
equals one. When z represents a negative value, the numerator is smaller
than the denominator; that is, the effort, cues, implements, memories, etc.,
are not equal to the task set by the goal, its obstacles, and the learner’s
interfering habits.

* An increase or decrease in any one of these factors by itself may not, as
pointed out in “The Definition of Learning,”¹⁰ constitute learning. It does so
only when the change is equivalent to an increase in the value of m/r.

This is the situation which is characteristic of trial and error. It 
is the situation in which z can be reduced to zero only by increasing 
e(h + m) until it is equal to g(o + r).* In other words, it is the
situation in which initial achievement can be established only by the 
learner increasing his effort (e), looking about for more implements 
and cues (h), and putting his memory, imagination, etc., (m) to work. 

When z represents a positive value in formula No. 3, the numerator 
is larger than the denominator—the external stimuli (h) together with 
the ideas (m) and effort (e) which they arouse go beyond whatever 
goal and sensed resistances may have been present at the moment of 
their occurrence. This is the situation which gives rise to the con- 
ditioned reflex. It is the situation in which z can be reduced to zero 
only by increasing g(o + r) until it is equal to e(h + m). In other
words, it is the situation in which the change, instead of being in the
cues, effort and memories, is in the goal and its attendant difficulties,
as when a stimulus arouses or changes a desire.

In trial and error, the goal and obstacles, g(o + r), are constant
and e(h + m) varies; whereas in the conditioned reflex the reverse is 
the case—the cue sequences are constant and the goal varies. 
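The distinction drawn above between the two types of learning turns only on the sign of z in formula No. 3. The following is a minimal sketch of that reading, not a measuring procedure; the function names and the numerical values are illustrative assumptions, since the article assigns no measured values to the factors.

```python
from fractions import Fraction


def learning_ratio(e, h, m, g, o, r):
    """Return e(h + m) / g(o + r) as an exact fraction (integer inputs)."""
    return Fraction(e * (h + m), g * (o + r))


def classify(e, h, m, g, o, r):
    """Return (z, label), where the ratio equals 1 + z as in formula No. 3."""
    z = learning_ratio(e, h, m, g, o, r) - 1
    if z < 0:
        # Numerator too small: e(h + m) must be raised to meet g(o + r).
        label = "trial and error"
    elif z > 0:
        # Numerator too large: g(o + r) must be raised to meet e(h + m).
        label = "conditioned reflex"
    else:
        # The ratio already equals one.
        label = "equilibrium"
    return z, label


# Hypothetical values chosen only to show the three cases.
for h in (2, 4, 6):
    print(h, classify(e=5, h=h, m=0, g=4, o=0, r=5))
```

With these assumed values only the external help h is varied, so the same goal appears in turn as a trial-and-error situation, an equilibrium, and a conditioned-reflex situation.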

It may not at once be conceded that the foregoing is a correct 
summary of the relationships involved in the typical situation giving 
rise to a conditioned reflex. This is because the concept of goal has 
been generally excluded from discussions of experiments in this field. 
But the exclusion has been an artificial one. For the experiments 
themselves show, with few if any exceptions, situations so arranged
as to insure avoidance behavior or seeking behavior (in a word, goal
behavior) on the part of the animal.†

What we usually observe in animals in a conditioned reflex experiment
closely resembles that which we observe in ourselves under
similar circumstances. The animal reacts reflexively to his environment
in a fairly neutral way; that is, his reactions do not appear to be
distorted or strongly organized by some internal (goal seeking) condition.
This is followed by a strongly organized or goal behavior.
The picture is usually not of one non-purposive reflex followed by
another, but rather of reflexive behavior followed by trial and error
behavior or established goal behavior.

* It may be argued that similar results could be obtained by reducing g(o + r).
But insofar as this involves changing the original goal or the obstacles, it would
change the basic conditions of the task. So far as the reduction of r is concerned,
it does not appear possible to bring this about directly and voluntarily except
through an increase in m.

† To say that the internal tension pattern, or balance of forces, which gives
rise to purposive behavior which alters the external balance of forces, is in turn
(reflexively) altered by that which it alters is true. But to say that therefore the
purposive (or goal) component need not be considered is not true. The causal
relationship is circular and hence both components must be taken into account.

The analysis of this behavior, by observers, into stimulus-response
(S-R) categories is familiar. It is a division somewhat different from
that which is made from the point of view of the experiencer in the
learning formula. But as both descriptions cover the same behavior
they should have equivalent elements.

First it is necessary to agree upon an exact meaning of S. The 
broad meaning accepted, perhaps, most generally is that a stimulus 
is an arrangement of external factors adequate to bring about a rear- 
rangement of internal activities. If we refer to the latter as R, we 
may say that S = R. 

“An arrangement of internal activities” at once suggests g/e in
our formula: effort or energy or tension distributed in a pattern,
leaving a patterned gap or goal (g).

If R = g/e, then S should equal (h + m)/(o + r). For S = R and, in our
equation, g/e = (h + m)/(o + r). But it may appear that since m and r are
internal factors they should not be included in a definition of S. This,
however, is not true, for no external factor, purely as an object,
constitutes a stimulus. Rather, an object is a stimulus only insofar as it is a
part of an external pattern of resistances (obstacles) and facilitations
(o and h) which bears a certain relationship to an internal pattern of
resistances and sensitivities (r and m). Hence it is not unreasonable
to define S as (h + m)/(o + r).


TABLE I. STIMULUS-RESPONSE EQUIVALENTS

Symbols | Definitions                                                    | Learning formula equivalents
S       | Stimulus: the external facilitation-resistance pattern in      | (h + m)/(o + r)
        | relation to the internal sensitivity-resistance pattern.       |
R       | Response: the internal activity pattern.                       | g/e

II. USE OF EQUATIONS IN CONDITIONED REFLEX EXPERIMENTS


With these definitions in mind, we may now apply the learning 
equation to the observed facts of the classical conditioned reflex 
experiment in which a dog is placed in an experimental chamber, 
exposed to the sound of a bell and then given meat. 

In this experiment the animal has usually become more or less
habituated to the experimental chamber before the introduction of
S₁ (the bell). Indeed, Pavlov points out that this or its equivalent
is an essential preliminary. It is necessary that the reaction (R₀) to
the experimental chamber (S₀) shall be adequate,* that the animal
shall be in a neutral or balanced state with respect to the chamber;
in other words, that R₀ shall equal S₀. In this case, of course,

$$\frac{S_0}{R_0} = 1 \tag{4}$$

and substituting the Learning Formula Equivalents in Table I, we find
that formula No. 4 is the same as formula No. 1.

When the animal has reached this condition of relative equilibrium,
any addition to S₀ without a corresponding addition to R₀ will change
the equation to†

$$\frac{S_1}{R_0} = 1 + z \tag{5}$$

The bell constitutes such an addition. The stimulus is increased, 
thereby upsetting the equilibrium between the stimulus (or cue- 
memory) pattern and the goal-effort (g/e) response of the animal. 
Formula No. 5 is therefore what we start with in the typical condi- 
tioned reflex experiment. 

Substituting the learning formula equivalents, and transposing
so that stimulus and response factors fall on opposite sides of the
equation, this formula becomes

$$\frac{h + m}{o + r} = \frac{g}{e} + z\,\frac{g}{e} \tag{6}$$

* That is, that it does not arouse, or no longer arouses, the dog to investigate or escape.

† The same as equation No. 3 when learning formula equivalents are substituted.
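The step from formula No. 5 to formula No. 6 says that the augmented stimulus, (h + m)/(o + r), splits into the old response g/e plus an added goal factor z(g/e). A small numerical sketch may make that split concrete. The values and the names `stimulus`, `response`, `chamber` and `with_bell` are assumptions made for illustration only; they are not taken from Pavlov or from the article.

```python
from fractions import Fraction


def stimulus(h, m, o, r):
    """S in Table I: (h + m) / (o + r)."""
    return Fraction(h + m, o + r)


def response(g, e):
    """R in Table I: g / e."""
    return Fraction(g, e)


# Hypothetical equilibrium with the experimental chamber: S/R = 1.
chamber = stimulus(h=4, m=0, o=0, r=5)
chamber_response = response(g=4, e=5)

# The bell is modelled here as two extra units of external cue (h),
# with no immediate change in g/e.
with_bell = stimulus(h=6, m=0, o=0, r=5)

z = with_bell / chamber_response - 1      # formula No. 5: S/R = 1 + z
bell_response = z * chamber_response      # the added goal factor z(g/e)

print(z, bell_response, with_bell == chamber_response + bell_response)
```

Under these assumed numbers z comes out as 1/2, and the augmented stimulus equals the chamber response plus the added factor, which is the relation formula No. 6 expresses.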
The additional goal factor here, z(g/e), might be experienced by a
human being as an enquiring state of mind. Pavlov deduces the same
state of mind from the actions of his experimental animals and calls
it an “investigatory reflex.”

However that may be, we shall refer to the additional bell-response,
z(g/e), as R₁, to the general response to the experimental chamber, g/e, as
R₀, and to the augmented stimulus (h + m)/(o + r) (the experimental chamber
plus the bell) as S₁. Thus formula No. 6 becomes

$$S_1 = R_0 + R_1 \tag{7}$$


It should be noted that the animal’s investigatory reactions have 
been satisfied in the case of the experimental chamber and not in the 
case of the bell. Consequently the bell affects, and combines with, 
any stimulus which may occur while it is going on, thereby giving rise 
to another reaction. This we may express as follows: 


$$R_1 + S_2 = R_2 \tag{8}$$


Now if we combine these two stimulus-response situations as they 
are combined in the experiments, that is, if we add our two equations, 
we shall expect to obtain an expression of the results obtained in the 
conditioned reflex. 

This we may do by stating the fact that the sum of the left-hand
sides of equations No. 7 and No. 8 is equal to the sum of the right-hand
sides, that is, S₁ + R₁ + S₂ = R₀ + R₁ + R₂, which reduces to

$$S_1 = R_0 + R_2 - S_2 \tag{9}$$

Formula No. 9 expresses the fact that S₁ (the bell and the experimental
chamber) gives rise not only to appropriate responses to the
experimental chamber (R₀) but also to digestive reactions to meat*
(R₂) even though the equilibrating stimulus for these reactions (S₂) is
absent.

* Actually a partial anticipatory response, which constitutes a desire.

Whether or not the conditioning process represents learning as
defined by the formula, namely, any equivalent of an increased value
of m/r, depends upon whether or not R₂ is greater than R₁. For, as
will be seen from formula No. 2, an increase in g/e (without an equivalent
increase in h) corresponds to an increase in m/r. Since S₁ contains
the same value of h in both formulas 7 and 9, and since in the first
case it is equivalent to R₀ + R₁, and in the second it is equivalent to
R₀ + R₂, any superiority in the value R₂ over R₁ means an increase in
g/e without a corresponding increase in h. From formula No. 8 it is
clear that R₂ is greater than R₁;* therefore learning has taken place.

It appears, then, that our definition is adequate to cover this type 
of learning. Furthermore, the definitive formula may be used to 
help analyze, set up, and predict the outcome of many different experi- 
mental situations. In the case of the conditioned reflex, when due 
care has been taken that S and R are properly constituted† according
to the equivalents in Table I, these symbols may be used to predict 
the outcome of all sorts of experimental variations. 

For example, suppose that 


$$S_1 + S_2 = R_1 + R_2 \tag{10}$$

and that

$$R_1 + R_2 + S_3 = R_3 \tag{11}$$

It follows that the sum of the left-hand sides of these equations is equal
to the sum of the right-hand sides, and when we thus combine them and
cancel out those factors which occur on both sides of the resultant
equation, we get

$$S_1 + S_2 + S_3 = R_3 \tag{12}$$

From this it follows that any one of the three stimuli (S₁, S₂, S₃)
will equal R₃ minus the other two.
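The adding and cancelling described above is purely mechanical, and a short sketch can reproduce it. The symbols are treated as bare labels with no numerical values; the helper `add_equations` and the list representation are illustrative assumptions, not part of the article's method.

```python
from collections import Counter


def add_equations(*equations):
    """Add equations side by side and cancel terms common to both sides.

    Each equation is a (left_terms, right_terms) pair of symbol names.
    """
    left, right = Counter(), Counter()
    for lhs, rhs in equations:
        left.update(lhs)
        right.update(rhs)
    common = left & right  # terms that occur on both sides of the sum
    return (sorted((left - common).elements()),
            sorted((right - common).elements()))


eq10 = (["S1", "S2"], ["R1", "R2"])        # formula No. 10
eq11 = (["R1", "R2", "S3"], ["R3"])        # formula No. 11
eq12 = add_equations(eq10, eq11)           # formula No. 12

print(" + ".join(eq12[0]), "=", " + ".join(eq12[1]))
```

The two responses R1 and R2 drop out because they stand on both sides of the summed equation, leaving the three stimuli on one side and R3 on the other.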

Pavlov’ (p. 142) describes an experiment that, so far as may be 
gathered from the text, parallels this suppositious situation, and the 
experimental outcome accords with the theoretical one. He says,
“In one experiment there were used as components in a stimulatory 
compound two different tones, which appeared to the human ear to be 
of equal intensity. When the conditioned reflex to the compound 
became fully established, the tones sounded separately were found to 
produce an equal effect.” 





* The writer has pointed out elsewhere® that this relationship is according to 
the “‘Principle of Intensity,”’ a necessary factor in conditioning in the sense of 
arousing anticipatory adjustments. 

† Care must be taken to analyze the situation correctly in order to know when a
series begins. For whenever an equilibrium is reached (that is, whenever S/R = 1 
instead of 1 + z) whatever follows should be included in a new equation. 
In this case the two tones, S₁ and S₂, are supposed to give rise to
their two responses, R₁ and R₂ (hearing the tones, paying attention,
pricking up ears or what not), as in formula No. 10. While these
responses are going on, the unconditioned stimulus, S₃ (e.g., food), is
administered (so we may assume although no mention is made of it in
Pavlov’s brief description) and this gives rise to R₃, the unconditioned
response, as in formula No. 11. When the response to the combined
stimulus is established* (that is, when S₁ + S₂ = R₃ - S₃) it follows
that no matter whether S₁, S₂, and S₃ occur together or separately,
R₃ will always appear on the other side of the equation. In other
words, as Pavlov puts it, “the tones separately sounded produce an
equal effect.”

A search of the experimental literature would undoubtedly reveal
experiments paralleling many other suppositious combinations of S
and R. There may be, for example, experimental set-ups which meet
the requirements of the following propositions: S₁ = R₁, S₂ = R₂,
S₁ + S₂ = R₁ + R₂. In such a case two stimuli occurring simultaneously
give rise to two responses having approximately equal z components,
not two in which, as in formulas No. 6 and 7, one R equals
g/e and the other R equals z(g/e). In other words, the two stimuli give
rise to responses which are satisfying or annoying to about the same
degree.

The result of such an experimental combination of S₁, S₂ and R₁, R₂
should be that either stimulus should give rise to its own response and
a “partial anticipatory response”† to the other stimulus, that is,
S₁ = R₁ + R₂ - S₂ and S₂ = R₂ + R₁ - S₁. For example (an
example chosen not because it has any experimental precision, but
because it has certain illustrative value), if cigarettes and cocktails
were never indulged in except in conjunction with each other, then the
partaking of either should arouse the desire for (“partial anticipatory
response” to) the other.

* The repetition required to establish a conditioned reflex probably has to do
with the reduction of z and whatever process goes on in the adding and transposing
of the equation. The mathematical handling of symbols can only indicate reasonable
outcomes and cannot, of course, be substituted for a description of processes.

It has been pointed out elsewhere¹⁰ that the learning formula closely parallels
the formula for an electric current; and the physical processes (involving the
completion of galvanic currents that constitutes learning) have been described by the
writer in “An Electro Chemical Theory of Learning.” The article further
discusses the significance of repetition, and the physical basis of the conditioned
reflex.

† For a discussion of the significance of partial anticipatory responses see
reference 9, pp. 709-710.

The foregoing exemplifies how the learning formula may be used 
in the field of the conditioned reflex. Let us now turn to the field of 
part vs. whole learning. 


III. APPLICATION OF THE LEARNING RATIO TO EXPERIMENTS IN
PART VS. WHOLE LEARNING


Until means have been devised for determining with some precision 
the quantitative values to be assigned to the various factors in the 
definitive formula,* we may either proceed as we have in the discussion
of the conditioned reflex, or we may assign more or less arbitrary values 
to these factors, in order to indicate approximate equality, superiority, 
or inferiority. Since one procedure has already been emphasized, we 
may now, by way of example, use the other. 

Let us suppose that we have a poem of three verses of about equal 
length and difficulty. It so happens that the first verse by itself 
makes something of a logical unit (say a single sentence) and the next 
two verses likewise can conveniently be treated as a unit. Therefore
for the part method of learning we divide the poem into (1) the first 
verse, and (2) the next two verses. For the whole-method (3) we 
treat all three verses as a unit to begin with. 

In such a case the complexity (or difficulty) of the task (or goal) in 
(2) is twice as great as in (1), and, in (3), it is three times as great. 
So if we arbitrarily assign a value of 20 to g(o + r) when the task is
to recite one verse, the value becomes 40 and 60 respectively (as in 
Table II) when the task is to recite two and three verses. Likewise 
with e(h + m), if the effort, help, and memory necessary to overcome 
the difficulty of the task in (1) is 20, then it must be 40 and 60 respec- 
tively in (2) and (3). 





* Assigning quantitative values to the objective factors o and h should not
prove to be very difficult, and this having been done, values for r may be derived 
by holding e at a maximum and measuring the time of performance—or by some 
such indirect method. It is even possible that e and r may be more or less directly 
evaluated by some measurement of combined metabolic and galvanic responses.’ 
Every form of objective measure should, of course, help in the assignment of stable 
values. But insofar as any of the factors can be expressed quantitatively and the 
others varied or held constant at will, quantitative values may be derived according 
to the formula. 
TABLE II. ASSIGNED QUANTITATIVE EQUIVALENTS

Study units: (1) first verse; (2) second and third verses; (3) whole poem (three verses).

Factors in learning   | Before (1)          | Before (2)          | Before (3)          | After (1)           | After (2)           | After (3)
g(o + r)              | 20                  | 40                  | 60                  | 20                  | 40                  | 60
e(h + m)              | 20                  | 40                  | 60                  | 20                  | 40                  | 60
g                     | 4                   | 8                   | 12                  | 4                   | 8                   | 12
o                     | 0                   | 0                   | 0                   | 0                   | 0                   | 0
r                     | 5                   | 5                   | 5                   | 5                   | 5                   | 5
e                     | 5                   | 5                   | 5                   | 5                   | 5                   | 5
h                     | 4                   | 8                   | 12                  | 0                   | 0                   | 0
m                     | 0                   | 0                   | 0                   | 4                   | 8                   | 12
e(h + m)/g(o + r) = 1 | 5(4 + 0)/4(0 + 5)   | 5(8 + 0)/8(0 + 5)   | 5(12 + 0)/12(0 + 5) | 5(0 + 4)/4(0 + 5)   | 5(0 + 8)/8(0 + 5)   | 5(0 + 12)/12(0 + 5)
or eh/gr = 1          | (5 × 4)/(4 × 5)     | (5 × 8)/(8 × 5)     | (5 × 12)/(12 × 5)   |                     |                     |
or em/gr = 1          |                     |                     |                     | (5 × 4)/(4 × 5)     | (5 × 8)/(8 × 5)     | (5 × 12)/(12 × 5)























Let us suppose now that, to begin with, the memory of this par- 
ticular verse is nil, (m = 0), and that the help (the printed words) 
varies directly with the difficulty of the task (the more difficulty the 
more help), then the value of e in e(h + m) may be held relatively 
constant. If we assign a value of 5 to e, we get as a result the values of 
h shown in Table II. 

External obstacles may also be considered as nil, (o = 0), and 
as r varies directly with e in this instance, its value will also be constant. 
Then, if we assign a value of 5 to r, g will have the values shown in 
Table II.

During the process of memorizing (if e is held constant), h may be 
decreased to the precise amount that m is increased, and when h (the 
printed words) can first be dispensed with altogether, m has exactly 
the same value that h had to begin with, and h equals zero. In other 
words, under the conditions outlined, our formula at the first reading 
of the poem, or verse, is 


$$\frac{eh}{gr} = 1 \tag{13}$$

and when it has been committed to memory it is

$$\frac{em}{gr} = 1 \tag{14}$$
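The assignments behind Table II follow from three choices stated in the text: g(o + r) is set at 20, 40 and 60 for the three study units, e and r are each fixed at 5, and o is taken as nil. A minimal sketch of how the remaining entries follow from those choices is given here; the function and constant names are illustrative, and the figures are the article's arbitrary values, not measurements.

```python
E, R, O = 5, 5, 0                        # assigned effort, resistance, obstacles
DIFFICULTY = {1: 20, 2: 40, 3: 60}       # g(o + r) for study units (1), (2), (3)


def table_row(unit, memorized):
    """Derive g, h and m for one study unit, before or after memorizing."""
    total = DIFFICULTY[unit]
    g = total // (O + R)                 # from g(o + r) = total
    help_plus_memory = total // E        # from e(h + m) = total
    # Before memorizing all of it is external help; afterwards all is memory.
    h, m = (0, help_plus_memory) if memorized else (help_plus_memory, 0)
    return {"g": g, "o": O, "r": R, "e": E, "h": h, "m": m}


def ratio(row):
    """e(h + m) / g(o + r) for one column of the table."""
    return row["e"] * (row["h"] + row["m"]) / (row["g"] * (row["o"] + row["r"]))


for unit in (1, 2, 3):
    before, after = table_row(unit, False), table_row(unit, True)
    print(unit, before, after, ratio(before), ratio(after))
```

In every column the ratio stays at one; memorizing only shifts the whole of h over to m, which is what formulas No. 13 and No. 14 state.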
Accordingly when units (1), (2), and (3) have been committed to
memory, we have the numerical equivalents appearing in the last line
of Table II.

Moreover, in all three instances learning, as our formula defines it,
has taken place, for in (1), m/r has increased from 0 to 4/5, in (2), it has
increased from 0 to 8/5, and in (3), it has increased from 0 to 12/5.
Stated this way it looks as though the answer to the part-whole problem
were that the two methods are equivalent; that if unit (1) is
memorized and unit (2) is memorized, and if then they are combined,
the result will be the same as if unit (3) is memorized, for
4/5 + 8/5 = 12/5.

But such a conclusion is the result of ignoring some of the factors
involved in learning, one of the things the definitive formula is
designed to prevent. The combination of (1) and (2) is not, according
to the conditions we have described, only m₁/r₁ + m₂/r₂ but also
e₁/g₁ + e₂/g₂; in other words, the combination is

$$\frac{(e_1 + e_2)(m_1 + m_2)}{(g_1 + g_2)(r_1 + r_2)},$$

and, substituting our numerical equivalents in the three right-hand columns of
Table II, we have e₁ + e₂ = 10, m₁ + m₂ = 12, g₁ + g₂ = 12, and
r₁ + r₂ = 10, which gives us (10 × 12)/(12 × 10) = 1* (1 being the final unit or
whole poem), and in this equation the value of m/r is 12/10. Whereas
in study unit (3) our corresponding equation (the final one in Table II)
is (5 × 12)/(12 × 5) = 1, and in this the value of m/r is 12/5. In brief, learning in
the whole method is, according to our definition, superior to learning
in the part method. The difference, however, is not in the amount
of m but in the amount of r. We may argue, then, that if effort is the
same and final achievement (g) is the same, time must vary with
resistance; in other words, that under the circumstances described
part learning will take longer than whole learning. This is, of course,
subject to experimental proof.

* From formula 14. Compare also with the last three formulae in Table II.
If the first two of these were simply added their sum would equal two, whereas to
accord with the conditions of the experiment their sums must equal one. This
results from combining the factors separately and maintaining the original equation
as above.

The evidence so far at hand is, superficially at least, contradictory.
But in those cases where the conditions of the experiment approach
those we have proposed, where the whole is not too confusing to be
experienced by the learner as a whole, or unit, and where the parts are
(as in a poem or a piece of music) sufficiently complete to be
experienced also as wholes, the facts are in accordance with prediction based
upon our formula.
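The comparison of the part and whole methods worked out above from Table II can be checked numerically. The sketch below sums each factor separately over study units (1) and (2), as the argument requires, and compares the resulting m/r with that of unit (3); the names `AFTER`, `combine` and `summary` are illustrative assumptions.

```python
from fractions import Fraction

# After-memorizing values from Table II: unit -> (e, m, g, r)
AFTER = {1: (5, 4, 4, 5), 2: (5, 8, 8, 5), 3: (5, 12, 12, 5)}


def combine(*units):
    """Sum each factor separately over the given study units."""
    e, m, g, r = (sum(AFTER[u][i] for u in units) for i in range(4))
    return e, m, g, r


def summary(e, m, g, r):
    """Return the value of em/gr and of m/r for one set of factors."""
    return {"em/gr": Fraction(e * m, g * r), "m/r": Fraction(m, r)}


part = summary(*combine(1, 2))     # part method: units (1) and (2) combined
whole = summary(*AFTER[3])         # whole method: unit (3)

print(part, whole)
```

Both methods satisfy em/gr = 1, but m/r is 12/10 for the combined parts and 12/5 for the whole, the difference lying entirely in r, as the text concludes.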

There should be no contradiction in the outcome of experiments
in part vs. whole learning, if the conditions of the experiment are
arranged carefully in accordance with the learning ratio. One of the
most important conditions of the experiment outlined in Table II is
that the parts of the whole task are treated as units; that is, that the
goal of each part has a definite value (logical coherence and complexity)
in the mind of the learner and constitutes an end in view, as in the
whole-method the total poem constitutes an end in view. It is because
of this that em/gr = 1 in methods (1) and (2) instead of 1 + z
(z having a negative value). Where these conditions have been
approached, the contradictions in experimental outcomes have tended
to disappear.

Further experiments should be devised (or the learning ratio
applied to those already to be found in psychological literature) in
which the part tasks and achievements have a value in the learner’s
mind of 1 + z, the relative size of z in the different situations being
susceptible to estimation. In general the outcome of such experiments
should prove that the greater the negative value of z, the more
efficient the part learning. For example, if g should be given a value
of 12 in all of the last three equations in Table II, then in the first of
these, since em/gr = 1 - z, 5 × 4 = 12 × 5(1 - 2/3) and in the second
5 × 8 = 12 × 5(1 - 1/3). The sum of these two expressions gives an
equation which is the same as the final one in Table II. This identity
has been brought about by reducing the value of r in the two part-tasks
so that their sum is equal to the value of r in the whole task.
From this fact, we may further conclude and predict that the part and
whole methods will be found to approach equivalence to the extent
that the learner is aware of just what proportion of the whole each part
represents.
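The example above, in which each part is measured against the whole-poem goal, can be verified by solving em = gr(1 - z) for z. The sketch below does so with the Table II values; the names are illustrative and the figures remain the article's arbitrary assignments.

```python
from fractions import Fraction

E, R, G_WHOLE = 5, 5, 12           # effort, resistance, whole-poem goal (Table II)
PART_MEMORY = {1: 4, 2: 8}         # m for study units (1) and (2)


def z_for_part(m):
    """Solve e*m = g*r*(1 - z) for z, with g set to the whole-poem value."""
    return 1 - Fraction(E * m, G_WHOLE * R)


z1 = z_for_part(PART_MEMORY[1])
z2 = z_for_part(PART_MEMORY[2])

# Sum of the two part equations, left and right sides.
lhs_sum = E * PART_MEMORY[1] + E * PART_MEMORY[2]
rhs_sum = G_WHOLE * R * ((1 - z1) + (1 - z2))

# Effective resistance carried by each part, r(1 - z).
effective_r = [R * (1 - z1), R * (1 - z2)]

print(z1, z2, lhs_sum, rhs_sum, sum(effective_r))
```

The two part equations add up to 5 × 12 = 12 × 5, and the effective resistances of the parts add up to the resistance of the whole, which is the identity the paragraph describes.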

Enough examples have been given to indicate the applicability
of the learning equation to experimental and educational problems, 
and to show that its use is not entirely dependent upon great precision 
in assigning numerical equivalents for its factors. Finally, it has been 
demonstrated that the equation is, as all scientific formulations must 
be, susceptible to the proof of prediction and objective verification.


TRANSFER OF TRAINING IN REASONING


M. C. BARLOW 
University of Utah 


By transfer is meant the learning influence of one training activity 
on a different activity. A study of learning methods gives promise 
of yielding significant results in support of the theory that transfer is 
a major factor of learning among the higher mental abilities. In the 
present study, data are put forth to substantiate the above mentioned 
theory. Most previous studies on the subject are open to the objection 
that transfer effects probably resulted from the combined effect of 
similarity of test material and similarity of method in responding to 
the subject-matter. By method is meant the more or less generalized 
responses exclusive of identical responses carried over from specialized 
training to the end test. It is well known that the factors of method
and content must be controlled separately before valid conclusions 
may be made as to the amount of transfer which takes place. 

Headway has been made showing transfer among the complex 
mental functions. In the first statistical study on this topic¹ the
subjects who received special training in method gained on the average 
31.6 per cent more than the control subjects. For control and experi- 
mental groups, the two end tests included memorizing poetry, prose, 
Turkish-English vocabulary, dates, and the span of comprehension 
for consonants. The interpolated practice lessons included memoriz- 
ing poetry verbatim and learning columns of paired nonsense syllables. 
While the transfer gains are reliable, it cannot be certain that they 
are not the result of a carry over of similar content. 

In arithmetic, transference of method² has been obtained when the
end tests and the training lessons were similar. Group A, having 
practice only in different types of two-place addition, gained forty- 
six per cent in similar two- and three-place additions and subtractions. 
Group B, having practice and instruction in generalizing, improved 
67.6 per cent. Group C, having practice and analysis of principles 
involved, gained 53.8 per cent. Group D, from practice, analysis, and 
generalizing, made a gross gain of 63.8 per cent. In the foregoing
results, it is impossible to tell how much transfer was due to similarity
of content and how much to method.

¹ Woodrow, Herbert: “The effect of type of training upon transference.”
Journal of Educational Psychology, Vol. XVIII, 1927, pp. 159-172.

² Overman, J. R.: “The effect of method of instruction on transfer of training
in arithmetic.” Elementary School Journal, Vol. XXXI, 1930-1931, pp. 183-190.

Transfer effects¹ have been obtained in reasoning from one type of
subject-matter to another. One group of experimental subjects made 
a residual transfer gain of one hundred thirty per cent in solving the 
following type of logical problems: ‘‘There is a large lake in a forest, 
and the forest is in a country called Finland. Which is the larger, the 
lake or the whole country of Finland?’”’ The experimental group were 
trained over a period of ten weeks. 


Special training in analysis and formulation of definitions gave a
measurable amount of transfer.² The end tests were merely definitions
of ordinary terms. The training S’s were given three lessons of five
short experiments dealing with magnetism, and requiring from five to
ten minutes each. They were followed by illustrations and discussions
as to the requirements of a good definition. Both groups wrote 
definitions about the experiments. The result was a gain of 9.2 per 
cent by the experimental group and a loss of seven per cent by the 
control group. 

The problem of the present study is to ascertain the transfer effect 
on one kind of material by training in reasoning in material of an 
utterly different kind. One hundred and thirty-five subjects took 
part. There were seventy-six twelve- and thirteen-year-old pupils 
from the seventh and eighth grades. I employed two groups of 
elementary S’s of thirty-eight each. The two groups were paired in 
regard to their IQ’s. In CA their range was fifteen months. The 
mean IQ of the elementary experimental S’s was 107.23; the sigma 
of their IQ’s was 6.70, and the average mental age, 169.71 months.
The average IQ of the elementary control S’s was 106.76; the sigma 
of their IQ’s was 6.59 and the average mental age, 163.97 months. A
total of fifty-nine adult cases are reported. They were paired in 
two groups according to O. S. U. Psychological Centile scores. The 
average centile score of the thirty-one experimental S’s was 61.35, 
and the average centile for the twenty-eight control S’s was 61.20. 
In the experimental group twenty-three of the thirty-one subjects 
were in the centiles fifty to one hundred, while eight were below the 
fiftieth centile. Of the twenty-eight control subjects twenty-one were
above the fiftieth centile and seven were below the fiftieth centile.
The rho was .96; the r was calculated between the corresponding
numbers within the various deciles.

¹ Winch, W. H.: “The transfer of improvement in reasoning.” British Journal
of Psychology, Vol. XIII, 1923, pp. 370-380.

² Meredith, G. P.: “Consciousness of method as a means of transfer of training.”
The Forum of Education, Vol. V, 1927, pp. 37-45.

The four groups took the initial test and the end test, both consist- 
ing of fifteen of Aesop’s Fables. The task in both tests was to write 
the lesson conveyed by each fable. The experimental group were 
given twelve carefully planned lessons treating several phases of 
simple analysis, abstraction, and generalization. Each lesson lasted 
twenty minutes. One year later the S’s of the seventh and eighth
grades were re-tested.

The items of the initial and final tests were paralleled in regard to 
difficulty. The PE difficulty units corresponding to the different items 
were calculated from answers secured from ninety-one elementary- 
school pupils. The per cent of pupils answering each question cor- 
rectly was converted to PE units from a table such as Garrett’s table 
of the normal curve expressing percentages in PE units. In Table I 
are the names of the fables and their PE difficulty units. 
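
The conversion from per cent correct to PE units can be sketched as follows. This is a minimal illustration, assuming the usual relation that one probable error equals 0.6745 standard deviations of the normal curve; the function name is hypothetical, and the zero point of the scale used in Table I is not stated in the text, so the sketch measures difficulty from the median item rather than reproducing the tabled values.

```python
from statistics import NormalDist

# One probable error (PE) expressed in standard deviations of the normal curve.
PE_IN_SIGMA = 0.6745


def difficulty_in_pe(percent_correct: float) -> float:
    """Distance of an item from median difficulty, in PE units.

    Positive values mean that fewer than half of the pupils answered
    correctly (a harder item); negative values mean an easier item.
    The origin at the median is an assumption of this sketch.
    """
    if not 0.0 < percent_correct < 100.0:
        raise ValueError("percent_correct must lie strictly between 0 and 100")
    # Normal deviate above which the successful proportion of pupils falls.
    z = NormalDist().inv_cdf(1.0 - percent_correct / 100.0)
    return z / PE_IN_SIGMA


if __name__ == "__main__":
    for pct in (75.0, 50.0, 25.0):
        print(pct, round(difficulty_in_pe(pct), 2))
```

An item passed by one quarter of the pupils lies about one PE above the median, which is the defining property of the probable error.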


TABLE I.—FABLES COMPRISING END TESTS IN TRANSFER OF REASONING FROM
ONE KIND OF TRAINING MATERIAL TO ANOTHER, SHOWING NAMES OF
FABLES AND DIFFICULTY IN TERMS OF PE UNITS

Fable                              Difficulty    Fable                              Difficulty
The Widow and the Hen                 4.98       The Ant and the Grasshopper           5.10
The Crab and Her Mother               5.27       The Thirsty Pigeon                    5.52
The Angler and the Little Fish        5.88       The Lioness                           5.93
The Bundle of Sticks               [illegible]   [illegible]                           6.29
The One-Eyed Doe                      6.26       The Fox and the Mask                  6.38
The Creaking Wheels                   6.38       The Jackass in Office                 6.44
The Ass and the Grasshopper           6.44       The Goat and the Goatherd             6.58
The Trumpeter Taken Prisoner          6.58       The Countryman and the [illegible]    6.66
The Lion and the Bulls                6.74       The Great and the Little [illegible]  6.81
The Swallow and the Raven             6.81       [illegible]                           6.81
[illegible]                           6.90       The Mountain in Labor                 6.90
The Old Woman and Her [illegible]     7.01       The Stag and the Pool                 7.01
The Wolf and the Shepherds            7.23       The Fir Tree and the Bramble          7.37
The Gnat and the Bull                 7.37       The Fox and the Lion                  7.37
The Lion and His Three Coun[...]   7.37 or 7.73  The Wolf and the Horse                7.99

Average                               6.48                                             6.58

An answer was scored as correct if it gave a direct statement of the
lesson involved, or an unmistakable suggestion of the right answer. 
For example, consider the fable of “The Widow and the Hen.” It
goes as follows:

“A widow kept a hen that laid an egg every morning. Thought
the woman to herself, ‘If I double my hen’s allowance of barley, she
will lay twice a day.’ So she tried her plan, and the hen became so
fat and sleek, that she left off laying at all.”

The best answer is: “Figures are not always facts.” The following
other answers were also counted as right: “Don’t be greedy, let good
enough alone”; “Greed does not always pay”; “Be content with what
you have”; and “Don’t count your chickens before they are hatched.”

The following and other similar answers are wrong: “To be selfish
is to be useful”; “Don’t be anxious”; “Be Patient”; “Keep what
you get and try for no more”; and, “One hen can’t do the work of
two.” All papers were scored independently by two persons with
results that were in almost exact agreement. 

The interpolated training of the experimental S’s included twelve
lessons of twenty minutes each, given to instruction and practice in
reasoning. Four lessons were on analogies of the following form:
Prince is to Princess as King is to ______. The pupils completed
exercises in supplying one, two, and three missing terms. Analogies
given in every lesson were written on the board or mimeographed.
The material was also read to the pupils. The S’s read their answers
and gave the steps by which they arrived at the results. They wrote
out their steps of reasoning. In like manner, four lessons were given
on analysis and on practice with abstractions and generalizations,
proceeding from the concrete to the general and from the general to the
concrete. Four reading lessons were given emphasizing comprehension
and analysis of behavior situations. Attention was called to the
best answers. Pupils wrote the mental steps by which they arrived
at the answers.

Tabular results appear in Table II. The experimental group of
elementary S’s gained 64.03 per cent as a result of
being trained in reasoning during the interval between the end tests.
The corresponding control group lost 0.09 per cent. The experimental
adult group gained 23.70 per cent. The control adults gained 7.31 per
cent. The transfer was 16.27 per cent. Combining the two experimental
groups, the average gain was 37.96 per cent. Combined,
the control groups gained 5.59 per cent. The residual gain was
32.52 per cent.

TABLE II.—AVERAGE NUMBER OF CORRECT ANSWERS TO FABLES IN TWO END
TESTS BY EXPERIMENTAL AND CONTROL ELEMENTARY-SCHOOL AND COLLEGE
SUBJECTS SHOWING TRANSFERENCE OF REASONING IN ONE KIND OF
MATERIAL TO REASONING IN ANOTHER

                     Experi-     Control     Experi-     Control     Experi-     Control
                     mental      elemen-     mental      adult       mental      total
                     elemen-     tary S’s    adult       S’s         total       S’s
                     tary S’s                S’s                     S’s
Initial test         3.28        3.16        7.93        8.07        5.35        5.17
Final test           5.35        3.13        9.81        8.66        7.38        5.45
Gross gain           2.07       -0.03        1.88        0.59        2.03        0.29
Gain per cent       64.03       -0.09       23.70        7.31       37.96        5.59
Number of S’s       38          38          31          28          69          66
[illegible]       [illegible] [illegible] [illegible] [illegible] [illegible]    1.74
Per cent transfer   64.03        ...        16.27        ...        32.52        ...
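
The arithmetic behind the gain and transfer figures can be sketched as below. This is an illustrative reading only: it assumes that gain per cent is the gross gain divided by the initial mean, and that per cent transfer is the experimental gain per cent less the control gain per cent. The function names are hypothetical. Recomputing from the rounded means of Table II gives figures close to, but not always identical with, the printed ones (about 16.4 against the printed 16.27 for the adults).

```python
def gain_percent(initial_mean: float, final_mean: float) -> float:
    """Gross gain expressed as a percentage of the initial mean."""
    return (final_mean - initial_mean) / initial_mean * 100.0


def percent_transfer(exp_initial: float, exp_final: float,
                     ctl_initial: float, ctl_final: float) -> float:
    """Experimental gain per cent less control gain per cent (assumed definition)."""
    return gain_percent(exp_initial, exp_final) - gain_percent(ctl_initial, ctl_final)


# Adult means as printed in Table II.
adult_exp_gain = gain_percent(7.93, 9.81)
adult_ctl_gain = gain_percent(8.07, 8.66)
adult_transfer = percent_transfer(7.93, 9.81, 8.07, 8.66)

if __name__ == "__main__":
    print(round(adult_exp_gain, 2), round(adult_ctl_gain, 2), round(adult_transfer, 2))
```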

The number of sigma difference units in the difference of the mean 
scores of the initial and the final tests is 2.78 for the experimental 
groups combined. The control groups made no significant gains as a 
result of special practice. 

The sigmas and correlation coefficients are put forth for experi- 
mental and control S’s in the initial and final tests. The sigma of the 
elementary experimental group scores in the initial test is 2.19; in 
the final test it is 2.68; the r of the same scores is 0.65. The sigma 
of the adult experimental scores in the initial test is 2.75; in the final 
test it is 2.96; the r of the gross scores is 0.54. The sigma of the 
elementary control group scores in the initial test is 1.62; in the final 
test it is 1.67; the r is 0.57. The sigma of the adult control scores in the
initial test is 2.76; in the final test it is 2.82; the r is 0.65.

The more intelligent seventh- and eighth-grade pupils made greater
transfer gain than the less intelligent, while the more intelligent adults
made less gain than the less intelligent adults. The seventh- and
eighth-grade S’s of the upper fifty per cent, according to intelligence
test scores, made a gross gain thirty per cent greater than that of the 
lower half. Similarly, the upper fifty per cent, according to the initial 
reasoning test scores, gained eleven per cent more by transference 
than the lower half. On the other hand, the upper half of the adults, 
selected according to intelligence test scores, gained thirty-seven per 
cent less than the lower half. Similarly, the upper fifty per cent, 
selected according to the initial reasoning test scores, gained sixty-
two per cent less than the lower half.

During the year after the training, the control S’s tended to show
a larger non-practice gain than the experimental S’s. The retention 
test was given the elementary subjects a year after the training, care 
being taken to give each pupil the same test which he took at the close 
of the training series. The mean score of twenty-five cases of the 
experimental group was 5.72, which represents a non-practice gain of 
.37 of a point. The final test scores at the close of the training lessons,
by the pupils who did not take the re-test, averaged 4.7, which is 
slightly less than the group average of 5.35. The mean of the twenty- 
seven who made up the control for the re-test group was 5.81, which 
represents a non-practice gain of 2.68 in a year’s time. The average 
of the final test by the control S’s who did not take the re-test was 
4.00 which is slightly greater than the group average of 3.13. 


DISCUSSION 


In all the above-mentioned experiments where training was devoted
to instructions on abstracting, analyzing, and generalizing, a relatively
large amount of transfer resulted. When the final test material
differed entirely from the training matter, the transfer amounts were
as great as when the end tests resembled the interpolated content.
Transfer would seem to depend on the analysis of the mental steps
involved in learning; on descriptions of various aspects of the problems
under consideration; on comparisons in regard to similar problems;
on methods and practice in definitions; on proceeding from the concrete
to the general and vice versa with respect to reasoning in the training
material; and on intentional efforts to apply learning methods.


Several factors lend support to the theory that general transfer
takes place in the form of the learning curve. In the present study the
upper half of the seventh- and eighth-grade S’s, using mental test
scores as a criterion, made larger transfer gains than the lower half.
The lower half of the adults made greater gain than the upper half.
The experimental pupils of the elementary grades during the year after
the study maintained their gains, while the control S’s improved only
about as much as the experimental S’s gained as a result of two weeks
of special training. The lack of measurable growth among the experimental
S’s during the year following the training may be due to
the combined influence of forgetting and maturation, the one tending
to decrease and the other tending to increase reasoning efficiency.
Although transfer effects as found in studies of record have not appeared
to take place in the form of the learning function, if our theory is correct,
they may be shown to take place in such a way by the use of small
measuring units. Experiments are now under way by the writer to
test this assumption.

If it turns out that transfer conforms to the curve of learning, it 
would not appear to be contrary to existing theories. Transfer as a 
result of similarities between test material and the training material 
is to be accounted in terms of increasing complexity of the learning 
activities. Transfer from the viewpoint of generalizations may be 
thought of as integrated activities in which the way of doing the 
interpolated performances is applied to the performance of the end 
tests. 

The organization theory, meaning that transfer takes place in the
degree to which activities learned during practice become organized
components of the end test responses, implies learning to be a unitary
experience which grows more complex with practice. It would seem
reasonable to suppose that progress is much the same in generalized as
in specialized learning, that is, according to the negatively accelerated
learning curve.





THE EFFICIENCY OF CERTAIN INTELLIGENCE 
TESTS IN PREDICTING SCHOLARSHIP SCORES1


DOROTHY C. ADKINS 
The Mooseheart Laboratory for Child Research 


A program involving the administration of numerous tests was 
undertaken at Mooseheart in the school year 1930-1931. Among 
these tests were three designed primarily as tests of intelligence—the 
Kuhlmann-Anderson, the Morgan Mental, and the Otis Intelligence 
tests, which were administered to the high-school students each year 
from 1930 to 1933; in 1934, only the Kuhlmann-Anderson Intelligence 
test was given. 

Both in connection with the school survey itself and in connection 
with a broader longitudinal study of mental growth which is proposed 
at Mooseheart, it is desirable that the intelligence tests used be highly 
discriminating. For no matter how reliable a test may be, without 
substantial diagnostic power, it is useless. The available scores 
provided an opportunity to investigate the validity of the tests, which 
is the matter under consideration in the present paper, and the effects 
of practice on intelligence test scores, which will be reported in a 
subsequent paper.2

As has often been noted, in ascertaining the worth of a test it is of 
value to determine its relationship to an independent criterion, where
such a course is practically feasible, rather than to correlate it with 
another test of supposedly the same function. This is true because 
the latter course may yield merely consistency in the measurement of 
whatever function is being measured and because the worth of the 
other test used as a criterion may present a decided limitation. 

The tests of Kuhlmann and Anderson were constructed with 
chronological age as the sole formal criterion. While the makers of 
these tests recognize that factors other than intelligence vary with 
chronological age, they believe that they overcome this difficulty by 
depending on ‘‘common sense analysis in the selection of tests and on 
what was already known about the effect of training on certain types 





1 Acknowledgment is gratefully made to Dr. Martin L. Reymert, Director
of the Mooseheart Laboratory for Child Research, who suggested this study and
offered helpful advice.

2 Adkins, Dorothy C.: “The Effects of Practice on Intelligence Test Scores.”
To appear shortly in this Journal.

of tests.”1 However, since there does exist this decided possibility
that the resulting test may discriminate perfectly among age groups
and yet not correlate highly with intelligence, it would appear desirable
to employ some independent criterion other than, or in addition to,
chronological age, where one is accessible.

No other independent criterion being immediately available for
this study, one was devised from scholarship scores. While it is true
that factors other than intelligence enter into the determination of
school marks, it is pretty widely recognized that success in academic
subjects does require a considerable amount of intelligence. Moreover,
from a pragmatic standpoint, the use of scholarship provides us
with tests maximally useful in predicting school achievement, a purpose
to which intelligence tests are frequently adapted.

After a survey of the current marking system in the high school, 
three different indices of scholarship were considered. In a given 
subject, an individual earns a number of credit points, based on a 
Performance Record, an Achievement Record, and a Special Merit 
Record. For each school subject, these credit points are distributed, 
and letter grades A, B, C, D, and X (no credit) are assigned on the 
basis of the normal law of error. The three indices deemed worthy of 
trial are as follows: 


1. The Point-Subject-Ratio (P.S.R.). 

Letter grades were transmuted by an arbitrary code (A = 4 points, B = 3 
points, and so on), and the total number of points was divided by the total 
number of accredited academic subjects for the year for each person. 

2. The Total Credit Points (T.C.P.). 

This index was found by simply adding the total number of credit points 
earned by a person in all academic subjects for the year. 

3. The Credit-Subject-Ratio (C.S.R.). 

This index is analogous to the first, except that the total number of credit 
points (T.C.P.) was divided by the total number of academic subjects for the 
year. 
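
The three indices may be illustrated with a short sketch. The grade code below extends the stated values (A = 4, B = 3) by assuming C = 2, D = 1, and X = 0, which is one reading of “and so on”; the sketch also assumes that the same count of subjects serves as divisor for both ratios. The pupil's record is invented for illustration, and the function name is hypothetical.

```python
# Point values for letter grades; values below B are an assumed continuation.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "X": 0}


def scholarship_indices(records):
    """Compute P.S.R., T.C.P., and C.S.R. for one pupil and one school year.

    `records` is a list of (letter_grade, credit_points) pairs, one pair
    per academic subject.
    """
    n_subjects = len(records)
    if n_subjects == 0:
        raise ValueError("at least one subject is required")
    grade_total = sum(GRADE_POINTS[grade] for grade, _ in records)
    credit_total = sum(credits for _, credits in records)
    return {
        "PSR": grade_total / n_subjects,   # Point-Subject-Ratio
        "TCP": credit_total,               # Total Credit Points
        "CSR": credit_total / n_subjects,  # Credit-Subject-Ratio
    }


# Hypothetical record: four academic subjects in one year.
example = scholarship_indices([("A", 80), ("B", 65), ("C", 50), ("B", 70)])

if __name__ == "__main__":
    print(example)
```

Since T.C.P. is a plain sum, it rises with the number of subjects carried, whereas the two ratios do not.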


A preliminary investigation was made, involving Pearson correla- 
tion coefficients of the three tests with each of the three indices of 
scholarship for seventy-seven students who were seventh-graders in 
1930-1931 and who progressed regularly in school through 1933-1934. 
Since the present marking system was introduced in 1931-1932, the 





1 Kuhlmann, F. and Anderson, Rose G.: Instruction Manual. Revised Edition,
1933, p. 6.

test scores of that year are the earliest ones used in investigating the
predictive efficiency of the tests. For this phase of the study, only
the 1931-1932 data were involved, since the results, which are presented
along with others in Table I, showed rather slight differences
between the indices. Insofar as predictability is concerned, the small
differences present tended to favor Index 2 (Total Credit Points),
which is also the simplest to compute. In addition, the (Pearson)
intercorrelation coefficients between the three indices, which were
computed for the tenth-grade data, 1933-1934, were very high, ranging
from .95 to .97. Hence Index 2, obtainable by simply adding the
credit points earned by an individual in all academic subjects for the
year in question, was used as the criterion in the subsequent investigation
of the tests.


TABLE I.—VALIDITY COEFFICIENTS AND INTER-CORRELATION COEFFICIENTS FOR
THE VARIABLES INDICATED
(The First-row Coefficients Are Based on Data of the Eighth Grade, 1931-1932,
the Second on Data of the Ninth Grade, 1932-1933, and the Third on Data
of the Tenth Grade, 1933-1934)

                         Total     Credit-    Point-     Kuhlmann-   Morgan
                         credit    subject-   subject-   Anderson    mental     Otis
                         points    ratio      ratio
Total credit points       ...       ...        ...        .55         .61        .53
                          ...       ...        ...        .64         .55        .73
                          ...       .97        .96        .62         ...        ...
Credit-subject-ratio      ...       ...        ...        .56         .61        .51
                          ...       ...        ...        ...         ...        ...
                          .97       ...        .95        ...         ...        ...
Point-subject-ratio       ...       ...        ...        .59         .59        .50
                          ...       ...        ...        ...         ...        ...
                          .96       .95        ...        ...         ...        ...
Kuhlmann-Anderson         .55       .56        .59        ...         .47        .43
                          .64       ...        ...        ...         .60        .69
                          .62       ...        ...        ...         ...        ...
Morgan mental             .61       .61        .59        .47         ...        .59
                          .55       ...        ...        .60         ...        .53
Otis                      .53       .51        .50        .43         .59        ...
                          .73       ...        ...        .69         .53        ...
Mean                    286.68      ...        ...       92.06       74.40     115.38
                        245.44      ...        ...       93.43       85.38     135.05
                        212.97      ...        ...     [illegible]    ...        ...
Sigma                    68.32      ...        ...       16.19       14.92      18.61
                         63.40      ...        ...       19.19       15.90      20.08
                        112.34      ...        ...     [illegible]    ...        ...


TABLE II.—COMBINATION AND MULTIPLE CORRELATION COEFFICIENTS OF ALL
COMBINATIONS OF THE TESTS WITH SCHOLARSHIP (T.C.P.), TOGETHER WITH
BETA-WEIGHTS
(The First-row Coefficients Are Based on Data of the Eighth Grade, 1931-1932,
the Second on Ninth Grade, 1932-1933. Subscripts Indicate the First
Letter of the Name of the Test Involved)

Tests                                    Multiple   Combination r   Beta-weights
                                         r          (raw score)
Kuhlmann-Anderson and Morgan              .68       [illegible]     .35k, .45m
                                          .67       [illegible]     .49k, .25m
Kuhlmann-Anderson and Otis                .64       [illegible]     .40k, .35o
                                          .75       [illegible]     .26k, .55o
Morgan and Otis                           .64       [illegible]     .46m, .26o
                                          .75       [illegible]     .28m, .61o
Morgan, Otis, and Kuhlmann-Anderson       .70       .69             .36m, .31k, .18o
                                          .76       .75             .17m, .18k, .51o













TABLE III.—INTERCORRELATION COEFFICIENTS OF CONSECUTIVE TESTINGS OF
THE SAME GROUP OF SEVENTY-SEVEN SUBJECTS BY THE SAME TESTS

                                      Grades     Grades     Grades
                                      VII and    VIII and   IX and
                                      VIII       IX         X
Kuhlmann-Anderson                      .65        .66        .81
Morgan mental                          .75        .77        ...
Otis                                   .66        .67        ...
Scholarship (total credit points)      ...        .73        .76

Throughout this study, the data were treated by correlational
analysis. For all correlation coefficients, raw scores rather than 
derived mental-age scores were used. Mental ages were considered 
undesirable because the bases for determining them differ from test 
to test, because the Otis norms appear to be somewhat high in compari- 
son to others, because the procedure used in deriving the Kuhlmann- 
Anderson norms was questioned, and because, if a test is valid when 
raw scores are used, transformation into complicated mental-age 
scores is a questionable and perhaps useless procedure, at least so far 
as test analysis is concerned. It should be kept in mind that the 
eighth-grade data are first retest data for the seventh-graders of 
1930-1931; the ninth-grade, second retest data, and so on. While 
it is true that average scores are higher on retests, validity coefficients 
are apparently not unduly distorted by the use of such data—or at 
least not in any systematic direction. 

For the convenience of the reader, the following summary of the 
correlation coefficients which were computed is presented: 


1. For data of the eighth grade, 1931-1932, the correlation coefficients of all
test measures with the three scholarship indices (Table I). 

2. For the ninth-grade data, 1932-1933, and the tenth-grade data, 1933- 
1934, correlations of all test measures with T.C.P. (Table I). 

3. Intercorrelations of the three scholarship indices for the tenth grade 
(Table I). 

4. For the eighth- and ninth-grade data, multiple and combination (gross- 
score) correlations of the three intelligence tests with T.C.P. (Table II). 

5. Correlations of the same tests (where available) given in different years 
(Table III).

6. Correlations of T.C.P. from year to year (Table III). 


Without entering into an elaborate consideration of the results, we 
may draw the following conclusions from the tables: 


1. While there are no great differences among the scholarship indices,
T.C.P. has the highest average correlation (.56) with the three intelligence 
tests and the highest average intercorrelation with the other two scholarship 
indices (.96). 

2. For the eighth-grade data, the Morgan Mental test has the highest 
average correlation (.60) with the three indices of scholarship, the Kuhlmann- 
Anderson test ranking next, and the Otis last. For the ninth-grade data, this 
ranking is reversed, indicating that the differences in discriminating power 
among the tests are probably unreliable for the population under consideration.

3. The best combination of two tests for the eighth grade is the Morgan
and the Kuhlmann-Anderson (r = .68) and for the ninth grade the Morgan
and the Otis (r = .75). 

4. Combination of the three tests using Beta-weights yields a multiple 
correlation of .70 for the eighth grade and .76 for the ninth grade. 

5. A combination of any two tests using Beta-weights is practically as 
effective as all three together in predicting scholarship. 

6. Combination of the three tests using gross-score weights yields a 
coefficient of .69 for the eighth grade and .75 for the ninth, indicating that it 
is unnecessary to weight these tests when combining them. 

7. The Morgan test has the highest retest reliability, with an average of 
.76, Kuhlmann-Anderson is second (.71), and Otis last (.66).1

8. T.C.P. has an average year-to-year correlation of .74. 
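
The multiple correlations and Beta-weights for a pair of tests follow from the three zero-order coefficients by the standard two-predictor formulas. The sketch below is offered as a check on the tabled figures, not as the author's own computing routine; the function name is hypothetical. The inputs are the eighth-grade coefficients as read from Table I: Kuhlmann-Anderson with T.C.P. (.55), Morgan with T.C.P. (.61), and Kuhlmann-Anderson with Morgan (.47).

```python
from math import sqrt


def two_predictor_regression(r1y: float, r2y: float, r12: float):
    """Return (multiple R, beta for test 1, beta for test 2).

    r1y, r2y: correlations of each test with the criterion.
    r12: correlation between the two tests.
    """
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    # With standardized variables, R squared is the beta-weighted sum
    # of the validity coefficients.
    multiple_r = sqrt(beta1 * r1y + beta2 * r2y)
    return multiple_r, beta1, beta2


# Eighth grade: Kuhlmann-Anderson (test 1) and Morgan (test 2) against T.C.P.
R, beta_k, beta_m = two_predictor_regression(0.55, 0.61, 0.47)

if __name__ == "__main__":
    print(round(R, 2), round(beta_k, 2), round(beta_m, 2))
```

The recomputed multiple correlation rounds to .68, in agreement with conclusion 3. The recomputed weights (about .34 and .45) agree with Table II within the rounding of the input coefficients.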


DISCUSSION 


The scoring system throughout any extended survey involving the 
use of more than one test might be changed with considerable advan- 
tage from the customary mental-age basis to a percentile or a sigma 
basis. The mental-age norms provided with the tests herein con- 
sidered were highly non-comparable. Ideally, where results from 
several tests are to be interpreted and combined, comparable norms 
should be constructed specifically for the given environment. While 
such a course is not always feasible, it is particularly advisable when 
the environment under consideration differs markedly from the usual 
public-school situation. On the whole, for such extended studies, age 
norms seem much more valuable than grade norms. 

If a combination of tests is to be used for predictive purposes, it is 
probably more economical to add raw scores than to add weighted 
scores or to combine mental ages derived by diverse methods. The 
totals may then be transformed to any desired basis—mental age, 
percentile, sigma, etc.—for interpretative purposes. The resultant 
score will probably be more meaningful than in the case where such 
transformations are made before combination. 

Such a study as the above serves to emphasize the need for exten- 
sive item analysis of all of the tests under consideration, so that the 
best items can be combined into a single test which is shorter and yet 
productive of greater predictive efficiency. Here again—as is rather 
generally recognized but not always practiced—if the test environment 
is considered unusual, such item selection would ideally be carried on 
in the situation for which the resulting test is intended. 





1 The intervals between testings varied somewhat from test to test and from 
person to person; also, the Kuhlmann-Anderson test is changed slightly from the 
eighth grade to the ninth. 













COMPARISON OF SCORES OF TWO POPULATIONS 
UNDER EQUALIZATION OF SCORES OF SECOND 
ATTRIBUTE 


B. F. KIMBALL 
Division of State Planning, Albany, New York 


STATEMENT OF THE PROBLEM 


Given two samples A and B, one from each of two populations
I and II. Each individual of the two populations possesses at least
two attributes X and Y (such as Binet mental age and performance
mental age). The scores in these attributes are denoted by x and y
respectively.

It is desired that the two samples be compared as to the scores y
in the attribute Y (performance mental age) under equalization of
the scores x in the attribute X (Binet mental age) between the two
samples, in order to determine whether or not a difference in the
y scores of individuals of populations I and II is to be expected when
x scores are the same.

Equalization of scores x in an attribute X belonging to two samples
which are to be compared with respect to the score y in a second
attribute Y can in some cases be brought about by pairing individuals
that possess the same score x in the attribute to be equalized. However,
in most cases this cannot be done without discarding many
individuals from the samples with a resulting loss in the effectiveness 
(statistical efficiency) of the comparison. 

Following out lines of thought suggested by Fisher,1 it is possible
to set up a method which is simple to apply, and which at the same 
time avoids the discarding of individuals involved in the pairing of 
individuals, groups, or mean values of the two samples. 

The method will be illustrated by the discussion of the solution 
of the following numerical problem. In Table I are shown two 
samples, A and B of twenty-one individuals each, drawn from two 
populations. For each of these individuals, the scores in Binet mental 
age and Performance mental age (measured by the Cornell-Coxe 
performance ability scale2) have been recorded. In each sample a





1 Fisher, R. A.: Statistical Methods for Research Workers, London: Oliver and 
Boyd, Fourth Edition, 1932, Chaps. VII and VIII. 

2 See Cornell-Coxe Performance Ability Scale, Yonkers, N. Y.: World Book
Company, 1934.

TABLE I.—SCHEDULE OF DATA FOR NUMERICAL PROBLEM(a)

                          Sample A                    Sample B
Individuals          BMA(b)      PMA(c)          BMA         PMA
     1               12-6        12-5            11-3        10-5
     2               11-7        15-0            10-4        11-3
     3               10-5         9-7            10-3        12-8
     4               10-0        10-9            10-3         9-8
     5                9-6        10-11           10-1         9-8
     6                9-2         8-0             9-10       11-2
     7                9-1         8-8             9-3        12-10
     8                8-10        9-11            8-6         9-0
     9                8-6         8-11            8-2         7-4
    10                8-4        10-7             7-10        8-1
    11                8-4         8-0             7-8         9-1
    12                8-2        10-1             7-8         8-5
    13                8-0         8-10            7-8         7-9
    14                7-3         7-6             7-2         8-6
    15                7-0         7-6             7-0         7-9
    16                6-10        7-8             6-10        7-1
    17                6-10        6-9             6-8         7-2
    18                6-6         6-9             6-8         6-2
    19                6-2         5-8             6-2         6-10
    20                5-10        5-5             6-0         8-1
    21                5-8         5-3             5-8         6-5

Number of
individuals          n = 21      n = 21          n = 21      n = 21
Mean value           8-3.71      8-9.24          8-1.67      8-9.90
Standard
deviation            σ = 21.42   σ = 28.05       σ = 19.43   σ = 22.64
Coefficient of
correlation              r1 = 0.896                  r2 = 0.821

(a) Figures to the left of dashes denote years, and those to the right, months.
Thus 12-6 means twelve years and six months.
(b) Binet mental age.
(c) Performance mental age.







significant correlation coefficient exists between BMA and PMA. 
The problem is to determine whether or not these samples indicate a 
significantly different PMA of the two populations from which they 
have been drawn for equalized BMA. 


METHOD OF ATTACK 


The first step is to set up the regression equations of the two samples.
These are written as:

$$ Y_1(x) = \bar{y}_1 + b_1 (x - \bar{x}_1) \tag{A} $$
$$ Y_2(x) = \bar{y}_2 + b_2 (x - \bar{x}_2) \tag{B} $$


The essential question is to determine whether these regression
equations based on the samples indicate that the true regression equations
of the populations from which they have been drawn are significantly
different or not.


This question will be answered by the use of the regression equation

$$ Y(x) = \bar{y} + b (x - \bar{x}) \tag{1} $$

obtained by pooling the two samples. Thus $\bar{x}$ represents the mean
of all forty-two values of x, $\bar{y}$ the mean of all values of y, and

$$ b = \frac{\sum_{A+B} (x - \bar{x})(y - \bar{y})}{\sum_{A+B} (x - \bar{x})^2} $$

where $\sum_{A+B}$ means a summation including all individuals of both samples.
The method of attack will be to work with the x and y values
of the combined samples on the assumption that they are indicative
of a population which is not significantly different from populations I
and II as regards the attributes X and Y. The burden of the proof
will be to demonstrate whether this assumption be true or false. More
precisely, it will be demonstrated as to whether the regression equations
A and B are significantly different from each other on the assumption
that they apply to samples drawn from the same population.

In order to do this the regression equation (1) is considered to be 
the expected regression equation under the above assumption, and 
the equations A and B are thought of as possible variations due to 
random sampling. 
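
The three regression lines can be computed directly from the scores of Table I. The following is a minimal sketch, assuming ordinary least squares of PMA on BMA with all ages converted to months; the helper names are hypothetical and no significance test is attempted here.

```python
from statistics import mean


def months(age: str) -> int:
    """Convert a 'years-months' entry such as '12-6' to months."""
    years, extra = age.split("-")
    return 12 * int(years) + int(extra)


def to_months(column: str):
    return [months(entry) for entry in column.split()]


# Scores transcribed from Table I (years-months), individuals 1 to 21.
A_BMA = "12-6 11-7 10-5 10-0 9-6 9-2 9-1 8-10 8-6 8-4 8-4 8-2 8-0 7-3 7-0 6-10 6-10 6-6 6-2 5-10 5-8"
A_PMA = "12-5 15-0 9-7 10-9 10-11 8-0 8-8 9-11 8-11 10-7 8-0 10-1 8-10 7-6 7-6 7-8 6-9 6-9 5-8 5-5 5-3"
B_BMA = "11-3 10-4 10-3 10-3 10-1 9-10 9-3 8-6 8-2 7-10 7-8 7-8 7-8 7-2 7-0 6-10 6-8 6-8 6-2 6-0 5-8"
B_PMA = "10-5 11-3 12-8 9-8 9-8 11-2 12-10 9-0 7-4 8-1 9-1 8-5 7-9 8-6 7-9 7-1 7-2 6-2 6-10 8-1 6-5"


def fit_line(x, y):
    """Return (mean of x, mean of y, slope b) for the regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return x_bar, y_bar, sxy / sxx


xa, ya = to_months(A_BMA), to_months(A_PMA)
xb, yb = to_months(B_BMA), to_months(B_PMA)

xbar1, ybar1, b1 = fit_line(xa, ya)          # equation (A)
xbar2, ybar2, b2 = fit_line(xb, yb)          # equation (B)
xbar, ybar, b = fit_line(xa + xb, ya + yb)   # pooled equation (1)


def expected_y(x_value: float) -> float:
    """Expected PMA in months from the pooled equation (1)."""
    return ybar + b * (x_value - xbar)


if __name__ == "__main__":
    print("A:", round(xbar1, 2), round(ybar1, 2), round(b1, 3))
    print("B:", round(xbar2, 2), round(ybar2, 2), round(b2, 3))
    print("pooled:", round(xbar, 2), round(ybar, 2), round(b, 3))
```

The sample means recovered in this way (99.71 and 105.24 months for sample A, 97.67 and 105.90 months for sample B) agree with the mean values printed in Table I.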



















To compare equation A with regression equation (1) the latter
equation is written as if it applied only to individuals of sample A.
This is done by writing down the equation, first, in the original form,

$$ Y(x) = \bar{y} + b (x - \bar{x}) $$

and then subtracting from this the relation

$$ Y(\bar{x}_1) = \bar{y} + b (\bar{x}_1 - \bar{x}) $$

which is the expected value of Y(x) when $x = \bar{x}_1$ as based on the
common regression equation. The subtraction gives

$$ Y(x) = Y(\bar{x}_1) + b (x - \bar{x}_1) \tag{2} $$

The equation (2) represents the same straight line as (1), but it is
now expressed in terms of $\bar{x}_1$, the mean of the x’s of sample A, and
$Y(\bar{x}_1)$, which is the expected value of the mean of the y’s of sample A
as based on the combined samples [determined from equation (1)].
This equation is to be compared with the regression equation obtained
from sample A alone, which is

$$ Y_1(x) = \bar{y}_1 + b_1 (x - \bar{x}_1) \tag{A} $$

Thus the comparison resolves itself into the problem of determining
whether the differences $\bar{y}_1 - Y(\bar{x}_1)$ and $b_1 - b$ can be explained on a
basis of random sampling.


SAMPLING ERRORS OF REGRESSION COEFFICIENTS 


A solution of the specific problem of the determination of the
sampling errors of the regression coefficients has been given by Fisher.1
The variance of y is calculated by comparing y with its expected value
rather than with the mean value. Thus the standard deviation, s,
of y is estimated by the formula

$$ s^2 = \frac{1}{n - 2} \sum (y - Y)^2 $$

The division is by n - 2 rather than by n because two “degrees of
freedom” have been lost in the formula for the expected value Y.
In determining $s^2$ all values of y from the combined samples will be
used. This is the procedure used by Fisher in comparing two means2
and the same procedure would apply to the present problem as the








1 Fisher: Op. cit., paragraph 26, pp. 123-127. 
2 Fisher: Op. cit., p. 114. 
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combined samples are regarded as a sample of a combined population. 
In making the comparisons of regression equations A and B they are 
considered as possible variations of the regression equation (1) based 
on the combined population. In other words samples A and B are 
themselves considered samples of the combined population until 
proved to be otherwise. 

The mean value $\bar{y}_1$ is drawn only from individuals of sample A;
thus the standard error for $\bar{y}_1$ is given by $s/\sqrt{n_1}$. The standard error
of the regression coefficient $b_1$ is given by Fisher as

$$\frac{s}{\sqrt{\sum_A (x - \bar{x}_1)^2}}$$

Here again, as one is dealing specifically only with individuals of
sample A, the summation refers only to the twenty-one individuals
of this sample, and the mean $\bar{x}_1$ is the mean of the $x$'s in this sample.

The errors $\bar{y}_1 - Y(\bar{x}_1)$ and $b_1 - b$ are the discrepancies of $\bar{y}_1$ and
$b_1$ when compared with their expected values as based on the regression
equation set up from the combined samples, which are taken to
represent a single combined population which includes populations I
and II. The first question is whether these discrepancies are large
enough to indicate that the regression equation A is significantly
different from the common regression equation (1).

Following out a similar procedure with sample B, the discrepancies
$Y(\bar{x}_2) - \bar{y}_2$ and $b - b_2$ might be compared with their standard errors
$s/\sqrt{n_2}$ and $s/\sqrt{\sum_B (x - \bar{x}_2)^2}$, where the summation $\sum_B$ refers to
individuals of sample B only.

However, the final question is whether the regression equations A
and B are significantly different or not. This is answered by adding
the above discrepancies and applying the usual law of error for the sum
of two errors. Thus the discrepancy $(\bar{y}_1 - Y(\bar{x}_1) + Y(\bar{x}_2) - \bar{y}_2)$
can be considered as the total discrepancy between $\bar{y}_1$ and $\bar{y}_2$. In
reality it is the sum of the discrepancies of $\bar{y}_1$ and $\bar{y}_2$ as compared
with their expected values (keeping the $x$'s constant). In the case of
the sum of the discrepancies of the second regression coefficients
$b_1$ and $b_2$, the quantity $b$ cancels out and the resulting sum is $b_1 - b_2$.
Thus the essential question of the problem will be answered by comparing
the following differences with their corresponding standard
errors:

Difference = $\bar{y}_1 - Y(\bar{x}_1) + Y(\bar{x}_2) - \bar{y}_2$, with the

$$SE = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

Difference = $b_1 - b_2$, with the

$$SE = \sqrt{s^2\left(\frac{1}{\sum_A (x - \bar{x}_1)^2} + \frac{1}{\sum_B (x - \bar{x}_2)^2}\right)} \tag{3}$$

where

$$s^2 = \frac{1}{n - 2} \sum_{A+B} (y - Y)^2, \qquad n = n_1 + n_2$$
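
The quantities in formula (3) can be computed from the raw scores of the two samples as sketched below. This is an illustration only: the function name `compare_regressions`, the dictionary keys, and the toy data are hypothetical, and least-squares fitting is assumed throughout.

```python
from math import sqrt
from statistics import fmean


def compare_regressions(xa, ya, xb, yb):
    """Differences and standard errors of formula (3) for samples A and B."""
    xs, ys = list(xa) + list(xb), list(ya) + list(yb)
    n = len(xs)
    x_bar, y_bar = fmean(xs), fmean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))

    def common(x):
        # Common regression equation (1), fitted to the combined samples.
        return y_bar + b * (x - x_bar)

    # Variance of y about the common line, with n - 2 degrees of freedom.
    s2 = sum((y - common(x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)

    def summary(x_s, y_s):
        # Mean of x, mean of y, sum of squares of x, and slope for one sample.
        xm, ym = fmean(x_s), fmean(y_s)
        sxx = sum((x - xm) ** 2 for x in x_s)
        slope = sum((x - xm) * (y - ym) for x, y in zip(x_s, y_s)) / sxx
        return xm, ym, sxx, slope

    x1, y1, sxx1, b1 = summary(xa, ya)
    x2, y2, sxx2, b2 = summary(xb, yb)
    return {
        "mean_diff": y1 - common(x1) + common(x2) - y2,
        "mean_se": sqrt(s2 * (1 / len(xa) + 1 / len(xb))),
        "slope_diff": b1 - b2,
        "slope_se": sqrt(s2 * (1 / sxx1 + 1 / sxx2)),
    }


# Hypothetical scores for two small samples.
result = compare_regressions([1, 2, 3, 4], [2, 4, 5, 8],
                             [1, 2, 3, 4], [3, 4, 7, 8])
```

Each difference is then judged against its own standard error, as the text goes on to describe.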


THE MEANING OF THE RESULTS 


Case I. No Significant Difference in Either $\bar{y}$ or $b$.—If neither of
the differences in $\bar{y}$ or $b$ proves to be significant it will have been
demonstrated that the two regression equations can be considered as sampling
variations of equation (1). In other words there is no reason to believe
that populations I and II are different in regard to the regression of
y on x.

Case II. Significant Difference in Both $\bar{y}$ and $b$.—Here the regression
equations A and B cannot be considered as sampling variations
of an intermediate regression equation. Thus they stand as the best
representations attainable of the regression of y on x in populations
I and II. Subtracting the two equations, there results the relation:

$$Y_1(x) - Y_2(x) = \bar{y}_1 - \bar{y}_2 + (b_2\bar{x}_2 - b_1\bar{x}_1) + (b_1 - b_2)x \tag{4}$$

The difference $Y_1 - Y_2$ as given in this equation is the expected difference
of the $y$'s for any given value of $x$. Since $(b_1 - b_2)$ is not zero,
this difference depends upon the value of $x$ under discussion.

Case III. Significant Difference in $\bar{y}$ but Not in $b$.—If the significant
difference exists between the $\bar{y}$'s only, the regression equations
A and B differ only as to the mean value of y, and because
the $b$'s are not significantly different, this difference would be
expected to be the same for all values of x. For example, since the $b$'s
are not significantly different, $b_1$ and $b_2$ could be assigned the intermediate
value $b$ already calculated in setting up regression equation
(1). The equations A and B could then be written

$$Y_1(x) = \bar{y}_1 + b(x - \bar{x}_1)$$
$$Y_2(x) = \bar{y}_2 + b(x - \bar{x}_2)$$

For any value of $x$, say $x_0$,

$$Y_1(x_0) - Y_2(x_0) = \bar{y}_1 - \bar{y}_2 + b(\bar{x}_2 - \bar{x}_1)$$

The expression on the right is seen to be independent of $x_0$ and represents
the expected difference between the $y$'s of the two populations
when the $x$'s are the same. [As a matter of fact this is the same as the
difference $(\bar{y}_1 - Y(\bar{x}_1) + Y(\bar{x}_2) - \bar{y}_2)$, which was originally tested, as
can be shown by substituting for $Y(\bar{x}_1)$ and $Y(\bar{x}_2)$ their values as given
by setting $x = \bar{x}_1$ and $x = \bar{x}_2$ in equation (1).]

Case IV. Significant Difference in $b$ but Not in $\bar{y}$.—In this case,
the fact that a significant difference between $\bar{y}_1$ and $\bar{y}_2$ is not found is
essentially an accident due to the particular values of $\bar{x}_1$ and $\bar{x}_2$ of the
two samples used. The difference $Y_1(x) - Y_2(x)$ in the expected
values of y corresponding to a given value of x is given by equation (4).
It has been shown that the difference between $b_1$ and $b_2$ is significant.
Thus the quantity $(b_1 - b_2)$ cannot be neglected in this equation.
Hence even though $\bar{y}_1$ and $\bar{y}_2$ do not show significant differences from
their expected values, the difference $Y_1(x) - Y_2(x)$ has been shown
definitely to vary with x, and it would accordingly be erroneous to
state that no significant difference existed between $Y_1(x)$ and $Y_2(x)$
for any value of x. The inference in this case is that if another sample,
say from population I, could be obtained whose mean x was sufficiently
distant from $\bar{x}_2$, the corresponding mean value of y would then be
significantly different from $\bar{y}_2$.
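
The four cases can be summarized as a simple decision rule on the two ratios of difference to standard error. The sketch below is illustrative only: the function name `classify_case` and, in particular, the cut-off `critical_ratio` are assumptions, since the article does not prescribe a fixed level of significance.

```python
def classify_case(mean_diff, mean_se, slope_diff, slope_se, critical_ratio=2.0):
    """Return "I", "II", "III" or "IV" following the four cases in the text.

    critical_ratio is a hypothetical cut-off for |difference| / SE;
    the article itself does not fix one.
    """
    mean_significant = abs(mean_diff) / mean_se >= critical_ratio
    slope_significant = abs(slope_diff) / slope_se >= critical_ratio
    if mean_significant and slope_significant:
        return "II"    # both differ: keep separate equations A and B
    if mean_significant:
        return "III"   # constant difference at every value of x
    if slope_significant:
        return "IV"    # difference in y varies with x
    return "I"         # both equations are sampling variations of (1)


# Hypothetical differences and standard errors.
print(classify_case(0.5, 1.0, 0.1, 0.2))   # neither ratio reaches the cut-off
```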


SOLUTION OF NUMERICAL PROBLEM


The numerical calculations necessary are indicated by the formulas 
for the differences and corresponding standard errors (see formula 
(3) above) and the intermediate regression equation (1). For the 
numerical problem outlined in Table I, the results are: 


$n_1 = 21$,  $\bar{x}_1 = 99.7143$,  $\bar{y}_1 = 105.2381$,  $b_1 = 1.1733$
$n_2 = 21$,  $\bar{x}_2 = 97.6667$,  $\bar{y}_2 = 105.9048$,  $b_2 = 0.9571$
$n = 42$,  $\bar{x} = 98.6905$,  $\bar{y} = 105.5714$,  $b = 1.0722$

The regression equation (1) is accordingly:

$$Y(x) = 105.5714 + 1.0722(x - 98.6905)$$

which can be written as:

$$Y(x) = -0.245 + 1.0722x \tag{5}$$

Setting $x = \bar{x}_1$ and $x = \bar{x}_2$ in this equation, it is found that:

$$Y(\bar{x}_1) = 106.669, \qquad Y(\bar{x}_2) = 104.473$$

Thus:

$$Y(\bar{x}_1) - \bar{y}_1 = 1.431, \qquad \bar{y}_2 - Y(\bar{x}_2) = 1.432$$

and

$$\bar{y}_1 - Y(\bar{x}_1) + Y(\bar{x}_2) - \bar{y}_2 = -2.863 \text{ mos.}$$


For every one of the forty-two pairs of values of x and y of the combined
samples, a corresponding Y is determined from equation (5)
and the quantity $(y - Y)^2$ calculated. The sum of these quantities
is:

$$\sum_{A+B} (y - Y)^2 = 7046.83$$

and from this sum is calculated the estimated standard deviation
s of y from its expected value Y. This is given by:

$$s^2 = \frac{1}{n - 2} \sum_{A+B} (y - Y)^2 = \frac{7046.83}{40} = 176.1707$$

Thus the standard error of the difference $(\bar{y}_1 - Y(\bar{x}_1) + Y(\bar{x}_2) - \bar{y}_2)$
is:

$$SE = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{176.1707\left(\frac{1}{21} + \frac{1}{21}\right)} = \sqrt{16.7782} = 4.096$$

Since the $\bar{y}$ difference is considerably less than its standard error
this difference cannot be considered as significant.
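
The arithmetic of this step can be checked from the summary values alone. The sketch below uses only figures printed in the text; the variable names are hypothetical, and because the inputs are already rounded the results agree with the printed ones only to about the third decimal place.

```python
from math import sqrt

# Summary values as reported in the text.
n1, n2 = 21, 21
x1_bar, y1_bar = 99.7143, 105.2381
x2_bar, y2_bar = 97.6667, 105.9048
x_bar, y_bar, b = 98.6905, 105.5714, 1.0722
sum_sq_resid = 7046.83   # sum of (y - Y)^2 over both samples


def common(x):
    """Common regression equation (1)."""
    return y_bar + b * (x - x_bar)


mean_diff = y1_bar - common(x1_bar) + common(x2_bar) - y2_bar
s2 = sum_sq_resid / (n1 + n2 - 2)          # n - 2 degrees of freedom
mean_se = sqrt(s2 * (1 / n1 + 1 / n2))

print(common(x1_bar), common(x2_bar))
print(mean_diff, s2, mean_se)
```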

In the case of the difference $b_1 - b_2 = 0.2162$, the standard error
involves the quantities

$$\sum_A (x - \bar{x}_1)^2 = 9638.28, \qquad \sum_B (x - \bar{x}_2)^2 = 7926.66$$

which are the squares of the standard deviations of x for each sample
multiplied by the number of individuals in the respective samples.
Thus the standard error in this case is:

$$SE = \sqrt{s^2\left(\frac{1}{\sum_A (x - \bar{x}_1)^2} + \frac{1}{\sum_B (x - \bar{x}_2)^2}\right)} = \sqrt{176.1707\left(\frac{1}{9638.28} + \frac{1}{7926.66}\right)} = \sqrt{.018278 + .022225} = 0.201$$

The probability that the difference 0.2162 with standard error 0.201
be due to random sampling, as calculated from tables of the probability
integral, is 22 per cent. This means that under the assumption that
the difference $b_1 - b_2 = 0.2162$ is accounted for only by random
sampling, the probability is about one in five that this difference
should be as large as it is. Thus this difference can hardly be considered
significant, but it is at least more suggestive of an underlying
difference in populations I and II than the difference in the $\bar{y}$'s.
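
The standard error of the slope difference can be reproduced in the same way from the printed sums of squares. This is a minimal check using the figures given above; the variable names are hypothetical.

```python
from math import sqrt

s2 = 176.1707                      # variance of y about the common line
sxx_a, sxx_b = 9638.28, 7926.66    # sums of squares of x in samples A and B
slope_diff = 0.2162                # b1 - b2

slope_se = sqrt(s2 * (1 / sxx_a + 1 / sxx_b))
ratio = slope_diff / slope_se      # difference expressed in standard errors

print(slope_se, ratio)
```

The ratio is only slightly greater than one, which is the sense in which the difference "can hardly be considered significant."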


DISCUSSION OF THEORY 


The feature of the theory presented in this article is that equaliza- 
tion of one attribute is brought about without the loss of any 
individuals through pairing methods, while at the same time the 
correlations of the scores in the two attributes are brought into the 
picture so as to increase materially the precision of the comparison. 
Furthermore, the usual practice in comparing two samples by pairing 
methods has been to seek to evaluate only the difference in the mean 
values (of the scores in the attribute to be measured), while the differ- 
ence in the other regression coefficients is neglected. Such a practice 
leaves out a very important element in the comparison (see discussion 
of Case IV above). 

The method of attack has been to set up an “intermediate”
regression equation (equation (1) above) and calculate the standard 
deviation of y from its expected values as determined from this regres- 
sion equation. 

A method of comparing regression coefficients is suggested by 
Fisher, as mentioned above. The present author has gone further 
than the specific suggestions of Fisher, in using the regression equation 
based on both samples for the determination of the expected values of y. 

The theory rests, of course, on the assumption that the normal law 
of error can be applied to deviations of y from their expected values. 
The author made a rough test of this in the case of the numerical prob- 
lem presented above and found that the deviations y — Y roughly 
represented a normal distribution. If there is reason to believe that 
these deviations do not obey the normal law of error, the theory should 
not be applied. Perhaps, in such cases it will be found that the regres- 
sion is non-linear. 

It should be possible to develop similar methods applicable to
non-linear regression, and to cases where two or more attributes are to
be “equalized.” The author believes that such problems offer a
field for further interesting and useful research.


SIGNED VERSUS UNSIGNED ATTITUDE
QUESTIONNAIRES


STEPHEN M. COREY 
University of Wisconsin 


In the literature dealing with attitude testing, statements are 
frequently made to the effect that a scale which is signed is thereby at 
least partially invalidated. For example, Katz and Allport® in their 
Syracuse Reaction Study frankly expected to get more valid data 
without calling for signatures on the questionnaires. Bain! hoped for 
the same result in his study of religious attitudes of college students. 
“In order to depersonalize the study,” Bain wrote, “each student was
given a number.” Similarly, attitudinal studies by Dodd,* Garrison
and Mann,® Harris, Remmers and Ellison, Moore and Garrison,'! 
Smith,'* Uhrbrock,” and Gray,® made use of or recommended anony- 
mous questionnaires. This list is not complete but it serves to illus- 
trate a rather widespread conviction. 

The effect which a signature might have on the validity of an 
attitude questionnaire depends on many factors. In the first place, 
the institution, practice, or group toward which an attitude is being 
expressed might be of significance. If an individual were asked to 
express his attitude toward sex practices, it would seem that the truth 
might be more nearly approximated were the attitude scale not signed. 
Another factor, frequently operating more or less independently of 
the nature of the attitude being investigated, might be the subject’s 
reaction to the person administering or scoring the blanks.‘ It is 
conceivable that even with respect to sex practices signatures on 
the attitude questionnaires would not invalidate them if they were 
administered by certain individuals in whom the testees had confidence 
and from whom no punishment of any sort was anticipated. Still a 
third factor influencing the effect of a signature upon the validity of an 
attitude questionnaire is the personality of the subject. A particularly 
aggressive, self-assertive individual would studiously avoid changing 
his verbal expression of an attitude merely because he was expected to 
sign his name. Other subjects would, of course, react quite differently.

Despite the plausibility of theorizing about the effect of signatures 
upon results obtained from attitude questionnaires, the writer has been 
unable to locate any empirical studies on the point. Olson? has 
recently reported the effect of waiving the signature upon results
obtained with the Woodworth-Mathews Personal Data Sheet, but the
factors involved in such a situation would be only roughly analogous
to attitude questionnaires in that the Woodworth-Mathews Sheet
involves a different type of material. Consequently, the following
brief report is of an inquiry undertaken in an attempt to determine the
relationship between signed and unsigned attitude questionnaires.
The particular practice with which the scales were concerned was
dishonesty in written examinations. In a later paper the relationship
of both signed and unsigned attitude scales to overt dishonest behavior
of this sort will be presented.

The attitude questionnaire used was constructed in the following 
manner: Each of some two hundred freshmen students was asked to 
state in one sentence an attitude toward cheating in examinations, 
either his own or one he had heard other students express. To these 
statements were added some which appeared in the rather extensive 
literature dealing with classroom dishonesty. These statements were 
then classified and all apparent duplications, which were numerous, 
eliminated. The seventy-five attitudinal statements which remained 
were then presented to the same student population with the request 
that each statement be evaluated on a nine-point scale in terms of 
whether it expressed sympathy with or antipathy toward cheating in 
examinations. The technique was patterned after that reported by 
Seashore and Hevner.** Barnhart? has presented evidence to indicate 
that this “order of merit” technique is comparable in reliability to the
more time-consuming psychological method of “paired comparison.”
The final scale included only those fifty statements which were rated 
by eighty per cent of the judges as belonging in two adjacent categories. 

These selected statements were then mimeographed and adminis- 
tered again, twice, to the same population in order to obtain both 
signed and unsigned papers, each of which might be identified. The 
following plan was followed: Two of the mimeographed attitude scales 
were stapled together and marked with pin pricks so that they could 
be identified as belonging to one pair. The pin pricks were then 
rubbed out with a knife blade so that they were invisible unless the 
scales were held up to a strong light. Their location was further 
disguised by having the pins penetrate the papers through periods 
which appeared at the end of the mimeographed attitudinal state- 
ments. These identical double sheets were then distributed to some 
one hundred fifty college students and these instructions given: “We
are interested in determining what the attitude of college students is
toward cheating on examinations. The attitude scale which you have
before you is not to be signed. We are at this time merely concerned
with finding out what percentage of the students in a class agree with
certain attitudinal statements. Here is the way you are to mark the
statements: If you agree with the statement put a plus sign before it;
if you agree very strongly put a plus sign with a circle around it. If
you are undecided, or have not thought about the matter, mark the
statement with a question mark. If you do not agree with the statement,
place a minus sign before it. If you disagree very strongly,
mark the statement with a minus sign with a circle around it.” This
method of reacting to attitudinal statements has been recommended
by Likert, Roslow and Murphy.”


After the students had finished marking the first attitude question- 
naire, they were asked to tear it off and the papers were collected in 
such a way that students would infer that identification was impossible. 
Then the subjects were told to mark the second attitude scale in exactly 
the same way, only a signature was desired on this second paper. 
From the comments and other noises that came from the class, the 
writer was very definitely given the impression that the request for 
signatures on the second papers came as a surprise. 

Because of the pin pricks on the attitude scales, it was possible 
to pair the signed and the unsigned scale for any particular student. 
These two papers were restapled and scored. The scoring was after 
the suggestion made by Likert, Roslow and Murphy,” and Grim,’ 
and will not be described here. The scores were treated statistically 
in order to bring out the following relationships: (1) The reliability 
of the signed and unsigned attitude questionnaires; (2) the relationship 
between the signed and unsigned questionnaires; and (3) the difference 
in gross scores between signed and unsigned attitude questionnaires. 

1. Reliability of the Questionnaires.—There was no significant 
difference in reliability between the signed and unsigned papers. The 
split half reliability stepped up by the Spearman-Brown Formula for
the unsigned attitude questionnaires was +.93 ± .02 SD and for those
that were signed +.90 ± .02 SD. Each was quite reliable. These
results further substantiate the claims made by Likert, Roslow and 
Murphy” for their method of scoring attitude questionnaires. 

2. The Relationship between Signed and Unsigned Attitude Questionnaires.—The
coefficient of correlation between scores made on the
signed and the unsigned questionnaires was +.85 ± .02. Apparently
the two measured much the same trait. Speaking relatively, those
individuals who reacted favorably to cheating on the unsigned questionnaire
also did so on those which they signed.

3. Difference in Average Scores between the Scales.—The writer
expected that the unsigned questionnaires would indicate a more 
lenient attitude toward cheating in examinations. College students 
are much less apt than their instructors to consider such behavior a 
serious offense. Knowing, as they do, however, the faculty atti- 
tude toward the practice, it seemed reasonable to anticipate that 
questionnaires that could be identified would represent an opinion 
more antagonistic to dishonesty in examinations. This was the case, 
although the difference between the mean scores on the two question- 
naires was not satisfactorily significant statistically. The mean score 
for the unsigned scale was 132.98 ± 23.38 SD and for the signed scale
129.03 ± 22.54 SD. The higher score indicates a less antagonistic
attitude toward cheating. The difference between these means with
its standard deviation was 3.95 ± 1.52. The formula employed for
determining the standard deviation of the difference was that recommended
when related (correlated) measures are compared. Apparently
the students were but slightly less willing to express their true opin- 
ions toward cheating on an examination when the opinions could be 
identified. 
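
The article does not print the formula it used for correlated measures. A common textbook form is sketched below as an assumption, not as the author's own computation: the function name `se_of_correlated_difference` and all numerical inputs are hypothetical, and no attempt is made to reproduce the reported value of 1.52, since the number of paired papers actually scored is not stated.

```python
from math import sqrt


def se_of_correlated_difference(sd1, sd2, r, n):
    """Standard error of the difference between two means obtained from
    the same n subjects, whose scores correlate r (assumed textbook form)."""
    se1, se2 = sd1 / sqrt(n), sd2 / sqrt(n)
    return sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)


# Hypothetical inputs chosen only to show the effect of the correlation term.
paired = se_of_correlated_difference(20.0, 18.0, 0.8, 100)
unpaired = se_of_correlated_difference(20.0, 18.0, 0.0, 100)
print(paired, unpaired)
```

A high positive correlation between the two sets of scores makes the standard error of the difference much smaller than it would be for independent groups, which is why the paired design matters here.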

This willingness to be relatively candid was apparent despite the 
fact that for half of the semester during which the attitude question- 
naire was administered the students in this particular course had been 
grading their own objective tests. This, one might expect, would 
lead them to be reluctant to admit their attitude toward cheating, 
particularly if it was sympathetic to the practice. The maximum 
possible gross score on the questionnaire which would be indicative 
of the greatest sympathy toward cheating, was two hundred fifty and 
the minimum gross score, indicative of an extreme antipathy toward 
cheating, was fifty. The range in actual scores on the unsigned papers 
was from sixty-one to one hundred ninety-eight with the above indi- 
cated mean of 132.98. For the signed papers the range was from 
sixty-eight to one hundred seventy-two, indicating again a tendency 
to be less sympathetic toward cheating when the opinion could be 
identified. These data incidentally reveal a great diversity of opinion
among students in regard to cheating on examinations. 


CONCLUSIONS 


It would appear from these results that in certain instances at 
least, even though the attitudes involved are subject to considerable 
censure, students are about as forthright in their expression when
questionnaires are signed as when they are not signed. It would not
be justifiable, of course, to infer that the presence of signatures on
attitude questionnaires, no matter what the circumstances, has no
effect upon their validity. The results do indicate, however, that the
concern of investigators over the invalidating effects of a signature
may have been exaggerated.


THE RELATIONSHIP BETWEEN DISCRIMINATION IN
MUSIC AND DISCRIMINATION IN POETRY


MELVIN RIGG 
Kenyon College 


Some years ago a friend remarked that she could never understand 
how a high-school teacher of hers, whose taste in literature was irre- 
proachable, could have shown such poor judgment in the selection of 
his furniture. The friend was assuming the existence of something 
like a general aesthetic faculty, operating in all of the fine arts. With 
the development of discrimination tests, this opinion may be investi- 
gated by means of the correlation technique. The present paper 
treats of the relationship between music and poetry. 

The music test used was the Oregon Music Discrimination Test, 
Series I, developed by Kate Hevner, J. J. Landsbury, and R. H. 
Seashore, and described by Hevner in a publication of the University 
of Oregon, Volume IV, No. 6, Studies in Appreciation of Art, edited 
by R. W. Leighton. The music is recorded on phonograph discs, 
which are offered for sale by the C. H. Stoelting Co. There are forty- 
eight items, each consisting of two parts, a short phrase from some 
standard composer and a spoiled version of the same, rewritten with 
the purpose of producing something similar but inferior. The original 
may of course be the first or the second phrase to be presented. The 
subject is requested to decide which is the better music, and whether 
the two phrases differ in rhythm, harmony, or melody. The validity 
of the test was insured by the judgments of expert musicians, and 
the series consists only of items in which they consider the original 
to be definitely superior to the parody. The reliability of the test, a 
correlation of the odd and even items corrected by the Spearman- 
Brown formula, has been reported as .86. 

The Rigg Poetry Test was developed by the present writer. It 
consists of two forms, E and F, each consisting of thirty-five items of 
the two choice variety similar to the items of the music test. A short 
selection, two to six lines, from some standard poet is matched with 
a spoiled parody, similar to it but written with the idea of producing 
something inferior. The original may be the first or the second of 
the pair, and the student is to record which he regards as the better 
poetry. The test has been undergoing revision for a number of years. 
Items which proved to be sources of unreliability have been eliminated, 
as well as those in which college professors of English considered the
parody to be better than the original. An account of the development
of the test will be published in the near future. Since both forms were 
used, the combined poetry test consists of seventy items. The relia- 
bility, computed from two earlier forms almost identical with Forms E 
and F, is .89 for the two forms combined. As will be shown later, 
however, neither the music nor the poetry test has, for the group of 
students treated in this paper, a reliability as high as those previously 
reported. 

Since people of high intelligence tend to secure higher scores on 
everything, in order to approach the real relationship between any 
two abilities intelligence must be controlled or partialed out. The 
intelligence test used in this study is the Psychological Examination 
of the American Council on Education. 

Discrimination scores are also partly the product of training. It 
may be assumed that college students have had ample opportunity 
to become acquainted with English poetry, but the situation in music 
is different, and in order to take into account variations in musical 
training a questionnaire was devised to ascertain the amount of educa- 
tion in music each student had had. For fifty-six of the seventy-one 
students two questionnaires were administered on different days, and 
a combined musical training score was computed on the basis of both. 
One point was allowed for each year of public-school music, glee club, 
choir, orchestra, band, or music appreciation courses, and four points 
for each year of private lessons or theory courses. In determining the 
combined score the attempt was made to ascertain the real situation. 
Thus if a student reports one year of piano lessons on one occasion 
and says nothing about them on another, the inference is that he is not 
lying the first time, but merely overlooked them the second, and full 
credit is given. When a different number of years is given, however, 
in the two reports, the average is taken. Although it is somewhat 
disconcerting to find that the student will not give exactly the same 
replies on two different occasions, the discrepancies are relatively 
unimportant, and the correlation between the two sets of reports is .88. 
It is assumed that the combined scores are more reliable than either 
set alone. By the Spearman-Brown formula the reliability of both 
sets combined would be .94, but since we have two reports from only 
fifty-six out of the seventy-one students, this reliability has been 
reduced proportionately to .92. 
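
The two reliability figures in this paragraph can be checked with the Spearman-Brown formula. The sketch below is illustrative: the function name `spearman_brown` is hypothetical, and the weighted average used for the "proportionate" reduction is an assumption about how the author arrived at .92, not a procedure stated in the text.

```python
def spearman_brown(r, k=2):
    """Reliability of a measure lengthened k times, given reliability r."""
    return k * r / (1 + (k - 1) * r)


r_between_reports = 0.88
combined = spearman_brown(r_between_reports)   # two reports averaged

# Assumption: the 56 students with two reports have the combined reliability,
# and the remaining 15 of the 71 keep the single-report value.
weighted = (56 * combined + 15 * r_between_reports) / 71

print(round(combined, 2), round(weighted, 2))
```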

For the music and the intelligence tests, the reliability is the correlation
of the odd and even items; for the poetry test it is the correlation
of Form E with Form F, given on different days. All three have been
corrected by the Spearman-Brown formula.

The study is based upon seventy-one college men.


TABLE I.—THE RELIABILITIES AND INTERCORRELATIONS OF MUSIC (m), POETRY
(p), INTELLIGENCE (i), AND MUSICAL TRAINING (mt) FOR SEVENTY-ONE
COLLEGE MEN

The simple correlations are presented in the lower left corner of the table, the
reliability coefficients in the long diagonal, and the correlations corrected for
attenuation are presented in the upper right hand corner.

          m       p       i       mt
m        .79     .44     .40     .46
p                        .50     .10
i                                .09
mt                               .92











TABLE II.—PARTIAL CORRELATIONS BETWEEN THE VARIABLES
(1) When intelligence is held constant, (2) when musical training is held constant.
Finally (3) the relationship between music and poetry when both intelligence
and musical training are held constant.

       (1)                  (2)                   (3)
r mp.i     .24        r mp.mt    .35        r mp.imt   .20
r mmt.i    .45        r mi.mt    .41
r pmt.i    .14        r pi.mt    .44

(These coefficients are computed from the simple correlations in the lower left
hand corner of Table I.)


The examination of Table I reveals that the intercorrelations 
between the variables (in the lower left corner of the table) are all low, 
ranging from —.09 to .43. Especially noticeable is the low relation- 
ship between music discrimination and musical training. The reliabili- 
ties of the various measures are given in the diagonal cells. Correcting 
for attenuation increases the values slightly, although the important 
relationship between discrimination in music and discrimination in 
poetry remains low, only .44. 
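
The correction for attenuation divides an observed correlation by the geometric mean of the two reliabilities. The sketch below uses the reliabilities of .79 for the music test and .92 for the musical training score given in the text; the observed correlation of .39 is a hypothetical value chosen for illustration, and the function name is likewise an assumption.

```python
from math import sqrt


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Estimated correlation between two measures if both were perfectly reliable."""
    return r_xy / sqrt(rel_x * rel_y)


# Reliabilities from the text; the observed correlation is hypothetical.
corrected = correct_for_attenuation(0.39, 0.79, 0.92)
print(corrected)
```

Because both reliabilities are below one, the corrected value always exceeds the observed one, which is why the upper right corner of Table I runs slightly higher than the lower left.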

Intelligence adds a spurious relationship to music and poetry, for 
when partialed out (in Table II), the coefficient goes down. On the 
other hand, if everyone had the same intelligence, the relation between
music discrimination and the amount of musical training would be
higher. Holding musical training constant, we see that the relation 
between the music discrimination scores and intelligence becomes 
higher. Strangely enough, the partialing out of musical training does 
not materially change the relation of music to poetry. 
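
Partialing out a variable follows the usual first-order formula, and a second-order partial can be built from first-order ones. As a check, the sketch below derives the coefficient of column (3) in Table II from the three coefficients of column (1); the function name `partial_r` is hypothetical.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# First-order partials from column (1) of Table II (intelligence held constant).
r_mp_i, r_mmt_i, r_pmt_i = 0.24, 0.45, 0.14

# Holding musical training constant as well gives the second-order partial.
r_mp_imt = partial_r(r_mp_i, r_mmt_i, r_pmt_i)
print(round(r_mp_imt, 2))
```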

The first and the last of the partial correlations are probably the 
most significant. They show a relation between musical and poetic 
discrimination which is surprisingly low in view of all the possible common
elements, from similarity of test procedure on up. “Intelligence”
is apparently a large factor in producing the zero order correlation
between the two tests, and it may be questioned whether “intelligence”
is all partialed out by partialing out any one intelligence test.
Whether or not a specific aesthetic discrimination factor would appear
if we could deal with perfectly valid tests, we cannot say, but the
burden of proof rests with those who would assert such a relationship.


A FACTOR ANALYSIS OF STUDENT TRAITS


CARL N. REXROAD 
Stephens College 


As far as the author knows, Stephens College, a junior college 
for women, is the only one in which students are regularly rated on 
a number of traits displayed in and out of the classroom, and the 
resulting records furnish an unusual set of raw data for study of the 
inter-relations of traits. For such a study, Thurstone through his 
development of a technique for making factor analysis from correlation 
matrices has added a most valuable tool. The main purpose of this 
paper is to report the Thurstone analysis, but first it is necessary to 
describe briefly the source of the correlations on which the analysis is 
made.¹

The rating of students was begun in an attempt to get a more 
complete picture of the individual than is given merely by the marks 
made in courses. Each student takes at least five courses, so that 
five instructors have opportunity to observe her in class. All students 
live in dormitories under the supervision of trained and carefully 
chosen counsellors, and each student has a faculty adviser and almost 
without exception belongs to two or more organizations having faculty 
sponsorship, so that there is an unusual opportunity for observing her 
out of class. At each of the quarterly rating periods, the various 
faculty members who have had an opportunity to observe a given 
student rate her on as many of the traits as their observations make
possible. The scheme for rating is the conventional one of placing
a check at the appropriate position on a line. The ten traits or items
on which ratings are made are stated in Table I. 

These items were chosen from a long trial list and were selected 
on the basis of results from a special rating. For this special rating 
the hundred students most widely known by faculty members accord- 
ing to a statistical check were picked, and each faculty member then 
rated as many of these students as possible from his acquaintance with 
them, rating each on all trial items and also giving each a general rating 
on the basis of how nearly she approximated his idea of what an ideal 
student should be and do. From these ratings the correlation of each 
item with every other and with the general “ideal” rating was calculated. 

¹ A full description of the rating system and of the educational philosophy on 
which it is founded will probably be published shortly. Dean Weldon P. Shofstall 
is primarily responsible for its initiation and development. 

Those items which in combination promised to give the maximum 
multiple correlation with the general rating were determined 
by statistical procedure commonly employed in the selection of batter- 
ies of tests. The number of items needed was found to be ten, and 
this group when averaged and correlated with the general “ideal” 
rating gave a coefficient of .96. This correlation furnishes the chief 
and about the only possible form of validation of the rating system. 
The validity has also been partially checked from time to time by 
requesting faculty members to list their best and their poorest students 
and then determining the general standing, i.e. the average of all 
ratings, for the individuals in these extreme groups. In contrast to 
the difficulty in determining the validity of the rating system, the 
reliability can be easily checked at any rating period. Last year 
when this check was made by correlating the general standing derived 
from ratings given by half the raters with that from the other half, 
the coefficient of reliability was .65 for the first rating period; and for 
the last, when faculty members knew their students better, it was .88. 
Course marks have a reliability of only .35 when calculated in the same 
manner. Again if correlation between one semester and another is 
taken as a measure of reliability, that for general standing is .92 while 
that for grade average is .55. 
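
The split-half check described above can be illustrated with a short sketch. It is not the procedure actually used at Stephens College: the alternate assignment of raters to halves, the function names, and the sample ratings are all assumptions made for illustration, and no correction for halved length is applied because the text mentions none.

```python
import math

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(ratings):
    """ratings: one list of rater scores per student (at least two raters each).

    Assumption: raters are assigned alternately to the two halves.
    The general standing in each half is the mean of that half's ratings.
    """
    half_a = [sum(r[0::2]) / len(r[0::2]) for r in ratings]
    half_b = [sum(r[1::2]) / len(r[1::2]) for r in ratings]
    return pearson(half_a, half_b)

# Hypothetical ratings for four students, four raters each.
sample = [[1, 2, 1, 2], [3, 3, 4, 4], [5, 4, 5, 5], [2, 2, 3, 2]]
print(round(split_half_reliability(sample), 3))
```

The same `pearson` function applied to general standings from two different semesters would give the second kind of reliability mentioned in the text.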

The evidence given above indicates that the various correlations 
in the matrix to be analyzed are unusually trustworthy, particularly 
as correlations derived from ratings. Each correlation used in the 
analysis here presented was obtained at the end of the last school year 
and determined from data on the whole student body, i.e. each correlation 
represents eight hundred fifty cases, and the individuals represented 
in one correlation are the same as those in any other correlation. 
The crude coefficients range from .418 to .838; it seems unnecessary to 
give the whole set here. 

Table I shows the factor loadings when analysis is made by the 
centroid method. Each item has a heavy loading of the first factor, 
and it is the degree to which a student possesses this first general factor 
that primarily determines her superiority or inferiority as a student. 
The extent to which factors 2, 3, and 4 are possessed indicates, on the 
other hand, what type of student she is rather than how good a student. 
One of the results of the development of factor analysis has been to 
bring to the foreground the question of types, for it is difficult to 
interpret results from such analyses without thinking in terms of 
types, and in this case such an interpretation seems inescapable. 
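
For readers unfamiliar with the centroid method, the following is a minimal sketch of how loadings are taken from a correlation matrix. It is a simplified illustration, not a reproduction of the analysis reported here. The communality estimate (the largest absolute correlation in each column) and the reflection rule (reverse the variable with the most negative column sum until none is negative) are assumptions, and the small matrix is hypothetical, since the full set of coefficients is not given.

```python
import math

def centroid_factors(corr, n_factors, tol=1e-12):
    """Extract successive centroid factor loadings from a correlation matrix."""
    n = len(corr)
    resid = [list(row) for row in corr]
    all_loadings = []
    for _ in range(n_factors):
        # Assumed communality estimate: largest absolute off-diagonal entry.
        for j in range(n):
            resid[j][j] = max(abs(resid[i][j]) for i in range(n) if i != j)
        # Reflect variables until no off-diagonal column sum is negative.
        signs = [1] * n
        while True:
            col = [sum(signs[i] * signs[j] * resid[i][j]
                       for i in range(n) if i != j) for j in range(n)]
            worst = min(range(n), key=col.__getitem__)
            if col[worst] >= -tol:
                break
            signs[worst] = -signs[worst]
        # Column totals of the reflected matrix, diagonal included.
        totals = [sum(signs[i] * signs[j] * resid[i][j] for i in range(n))
                  for j in range(n)]
        grand = sum(totals)
        if grand <= tol:
            loadings = [0.0] * n  # nothing left to extract
        else:
            loadings = [signs[j] * totals[j] / math.sqrt(grand) for j in range(n)]
        all_loadings.append(loadings)
        # Remove this factor's contribution to obtain the residual matrix.
        for i in range(n):
            for j in range(n):
                resid[i][j] -= loadings[i] * loadings[j]
    return all_loadings

# Hypothetical correlations among four rated items.
example = [
    [1.00, 0.70, 0.50, 0.45],
    [0.70, 1.00, 0.48, 0.42],
    [0.50, 0.48, 1.00, 0.65],
    [0.45, 0.42, 0.65, 1.00],
]
for factor in centroid_factors(example, 2):
    print([round(v, 3) for v in factor])
```

When every correlation is positive, as in the matrix analyzed here, the first factor takes a positive loading on every item. Later factors are taken from the residuals after reflection, which is why they contrast one group of items (+) with another (−).
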
Factor 2 differentiates those who are stronger in classroom situations 
(+) from those who show up better in out-of-class situations (−). 
In the conventional system of grading the former but not the latter 
type would be given recognition. It is interesting to find that wise 
use of time is nearer the out-of-class cluster of items than to the in-class 
cluster. 

Factor 3 reveals the “plodder” in contrast to the “personable and 
gifted” student. The “plodder” (+) works without undue supervision 
and admonition, shows marked consideration for the rights and 
interests of others—probably because she is tending to her own busi- 
ness—, uses her time well, and gets a fair grasp of subject-matter— 
probably because of work and use of time. The opposite type (−) 
shows her “giftedness” in grasping the broader relations of course 
material and in displaying originality in and out of class, and she shows 
her “personableness” in creating a favorable impression and to a 
lesser extent by entering into the desirable social life of the school and 
showing interest in her class work. 

After these three factors were taken out, the residuals were small, 
but a fourth factor was determined. It discloses a “conscientious and 
gifted” student in contrast to a “personable” one. As indications 
of her “conscientiousness,” the former (+) uses her time wisely and 
works without undue supervision and admonition, and as indications 
of her “giftedness,” she sees the broader relations of course materials 
and displays originality in and out of class. The student who is 
merely “personable” (−) creates a favorable impression, gives at least 
the impression of enthusiasm and interest in her courses, graciously 
shows consideration for others, and, shall we say, receives unexpectedly 
good marks in her courses. 

It may be of interest for the reader to make comparisons between 
the “plodder” (3+) who is conscientious and ungifted and the “conscientious 
and gifted” student (4+), and between the “personable 
and gifted” (3−) and the “personable” (4−) types. 

As already indicated, this interpretation in terms of types seems 
inescapable, and the types revealed seem true to experience and 
observation, but one cannot help wondering whether the factors exist 
in the students or in the raters. Of course, if they exist only in the 
minds of the raters, they are nevertheless significant and important, 
for they still exert the same influence in one person’s judgment of 
another. Whether traits are associated in the one being judged or 
in the one making the judgment, it seems beyond dispute that some 
traits are regularly associated with each other, and factor analysis 
makes it possible to discover these associations. 
BOOK REVIEWS 


HARRY J. BAKER and VIRGINIA TRAPHAGEN. The Diagnosis and Treatment 
of Behavior-problem Children. New York: The Macmillan 
Company, 1935, pp. XIV + 393. 


There can be no question that there has been a need for a compre- 
hensive treatment of the subject of behavior problems of children. 
From the publishers’ advertising blurbs, or perhaps even from the title, 
this book of Baker and Traphagen’s would seem to be such a work. 
It is, therefore, greatly disappointing to find that a more accurate title 
would have been A Manual for the Detroit Scale for the Diagnosis of 
Behavior Problems. The fourteen chapters are divided into four groups 
of which only the first is not entirely concerned with the Detroit Scale. 

The basic idea of the Scale is sound. In order to make the exami- 
nation and evaluation of evidence in the child’s life history more 
systematic, the authors have divided the materials usually desired in a 
case history into sixty-six factors. There are thirteen concerning 
health and physical condition; thirteen concerning personal habits 
and recreational factors; eleven on personality and social factors; 
eighteen on parental and physical factors of the home; and eleven on 
home atmosphere and school factors. The selection of items and their 
arrangement is admittedly arbitrary, but experience has shown that 
the Scale is nearly optimum in practical use. For each of the sixty-six 
items the directions for administering give the basic questions for the 
child and its parents, and also additional questions for more detailed 
analysis. Each item is scored on a five-point scale; the sum of these 
scores constitutes a total score for the Scale. 

A group of one hundred eighty-one non-behavior-problem children 
had a mean score of two hundred eighty-five, while a group of one hun- 
dred eighty-nine behavior-problem children had a mean score of 
two hundred twenty. This difference of sixty-five points on the total 
Scale score was statistically significant. When the differences between 
the two groups on each item were calculated it was found that they 
varied from 2.46 for “general behavior” to −.08 for “home language.” 
Attention is called to the fact that variation among the differences 
means that different items have different values in diagnosis—yet 
no attempt is made to weight the total scores on the basis of the order 
of differences. This would seem to be a very desirable improvement 
in the quantitative scoring. 
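
One way such weighting might be carried out is sketched below. The scheme is an assumption introduced only for illustration and is not proposed by the authors of the Scale: each item is weighted in proportion to the difference between the group means on that item, items with a negative difference receive zero weight, and the weights are rescaled to sum to the number of items. Of the differences used, only 2.46 and −.08 come from the book; the middle value is hypothetical.

```python
def difference_weights(diffs):
    """Turn per-item group-mean differences into item weights.

    Assumed scheme: negative differences get zero weight, and the
    weights are rescaled so that they sum to the number of items.
    """
    clipped = [max(d, 0.0) for d in diffs]
    total = sum(clipped)
    if total == 0:
        raise ValueError("no item separates the two groups")
    scale = len(diffs) / total
    return [c * scale for c in clipped]

def weighted_total(scores, weights):
    """Weighted sum of one child's item scores (each on the 1-5 scale)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights))

# 2.46 and -0.08 are the extremes reported; 1.00 is a hypothetical middle item.
weights = difference_weights([2.46, 1.00, -0.08])
print([round(w, 3) for w in weights])
print(round(weighted_total([4, 3, 5], weights), 3))
```

Because the weights sum to the number of items, a weighted total stays on roughly the same scale as the unweighted sum.
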
For clinical use it is probably true that a qualitative analysis of the 
responses is of greater importance than the quantitative score. That 
the authors recognize this is indicated by Part III which is devoted 
to “interpretation of the sixty-six factors.” This part might well have 
been the most valuable section of the book. Unfortunately it is so 
carelessly written that a careful worker could never depend upon it. 
Such a statement is harsh criticism, but I believe it can be substanti- 
ated. For example, there are many instances of broad dogmatic 
statements which are certainly doubtful; one of which we will quote: 
“The system of a mother who is unhappy and nervous or sickly during 
her pregnancy contains poisons which may affect the system of the 
embryo. When the child of such a mother is born, he is apt to be 
delicate, sensitive, and nervous. Thousands of case histories of 
nervous, unstable, delicate, children show this correlation” (p. 119). 
One immediately would like to know where these case histories are, 
and whether there are no other factors to account for the child’s 
condition. 

The authors, in a similar categorical fashion, quote incidence 
figures, but seldom if ever give the source. They say that deafness is 
congenital in one-third of the cases and acquired by accident or disease 
in two-thirds. These data are apparently from the United States 
Census. However, Richardson and Shambaugh, in a study of over 
three thousand deaf children, give incidences exactly the reverse of 
these. Why do these authors select one set of data and neglect the 
other? How is the reader to know that such selection has been made? 

A third point that is inexcusable is internal inconsistency. On 
page 120 the mortality from “infantile paralysis” is said to be high, but 
on page 154 it is said that poliomyelitis “is seldom fatal!” Another 
instance: On page 61 Terman and Hocking’s data are quoted on the 
hours of sleep for children with the implication that it is a normative 
table; on page 195 a White House Conference Report is similarly 
quoted. Unfortunately the two sets of norms do not agree! 

Final evaluation of this scale must come through use. The authors 
have introduced an important new method, but to this reviewer, at 
least, it does not appear that they have done justice to the Scale in this 
public description of it. C. M. LOUTTIT. 

Indiana University. 
LOUISA C. WAGONER. The Observation of Young Children. New York: 
McGraw-Hill Book Co., 1935, pp. 297. 

This laboratory manual for use in child development courses having 
nursery school facilities fills a long felt need exceptionally well. Text 
material is limited to a brief discussion of the aims of the nursery 
school, the means it employs in attaining those aims and to specific 
instructions to students regarding nursery school procedures and how 
to observe in the nursery school. The manual is prepared in the loose- 
leaf style so students may fill out pages as assignments, turn them in, 
and later file them in note books. Spacing is adequate and well- 
planned with reference to the questions. 

The ninety-six exercises are grouped according to the following 
topics: the nursery school, its staff, equipment and records; activities; 
special phases; attitudes and personality; language; judgment, imagina- 
tion and memory; emotional development; adult-child relationships; 
and problems in child management. The questions are provocative 
and should prove stimulating to careful and accurate observation on 
the part of the student. The author has intentionally included many 
more exercises than can be covered adequately in a single course on the 
basis of the usual time available for observation. The variety of 
material thus supplied allows the instructor opportunity to choose 
exercises which fit the local situation best or which cover topics that 
he wishes to emphasize in the course. An excellent bibliography 
classified by topic is provided. Instructors in courses in child develop- 
ment who have nursery school facilities should welcome this excellent 
teaching aid. DOROTHEA McCARTHY. 

Fordham University. 


CHARLES H. JUDD. Education as Cultivation of the Higher Mental 
Processes. New York: The Macmillan Co., 1936, pp. VII + 206. 


Is acquisition of facts or learning to think more important in 
education, and does the former foster the latter? Even though most 
educators will admit that training the higher mental processes is 
important, nevertheless, most teaching does continue to emphasize 
facts rather than how to think. Professor Judd strongly opposes this 
trend, insisting that learning should be appropriate for the cultivation 
of the higher mental processes. To support this view evidence is cited 
from experiments and analyses. 

Emphasis upon recall of information appears to hinder develop- 
ment of adaptive thinking. It is pointed out that symbolic thinking, 
as illustrated in the advanced stages of arithmetical reasoning, depends 
upon an understanding of the laws of the symbolic system and is 
characteristically different from perceptual experience. There is 
evidence to indicate that unless satisfactory methods of thinking are 
established early in school, permanent improvement in study habits 
is unlikely to occur. 

Issue is taken with certain points of emphasis in Thorndike’s 
educational psychology and Ben Wood’s conclusions concerning 
measurement in higher education. These criticisms and certain other 
points made by the author will not find ready acceptance by all readers. 
Nevertheless Professor Judd has produced an extremely stimulating 
book, one that should be read by all who have a vital interest in the 
field. As one observes the nature of most contemporary teaching it 
is easy to agree that we need more emphasis upon “cultivation of the 
higher mental processes” in education. MILES A. TINKER. 
University of Minnesota. 








